Abstract. Disordered systems are an important class of models in statistical
mechanics, having the defining characteristic that the energy
landscape is a fixed realization of a random field. Examples include var- 
ious models of glasses and polymers. They also arise in other areas, like 
fitness models in evolutionary biology. The ground state of a disordered 
system is the state with minimum energy. The system is said to be 
chaotic if a small perturbation of the energy landscape causes a dras- 
tic shift of the ground state. We present a rigorous theory of chaos in 
disordered systems that confirms long-standing physics intuition about 
connections between chaos, anomalous fluctuations of the ground state 
energy, and the existence of multiple valleys in the energy landscape. 
Combining these results with mathematical tools like hypercontractiv- 
ity, we establish the existence of the above phenomena in eigenvectors 
of GUE matrices, the Kauffman-Levin model of evolutionary biology, 
directed polymers in random environment, a subclass of the generalized 
Sherrington-Kirkpatrick model of spin glasses, the discrete Gaussian free 
field, and continuous Gaussian fields on Euclidean spaces. We also list 
several open questions.

1. Introduction

Let us begin with a motivating example. Let $g = (g_v)_{v \in \mathbb{Z}^2}$ be a
collection of i.i.d. standard Gaussian random variables. A (1 + 1)-dimensional
directed polymer of length $n$ is a sequence of $n$ adjacent points in
$\mathbb{Z}^2$, beginning at the origin, such that each successive point is either
to the right or above the previous point. The energy of a polymer
$p = (v_0, \ldots, v_{n-1})$ in the Gaussian random environment $g$ is defined as

$$E(p) := -\sum_{i=0}^{n-1} g_{v_i}.$$

The 'ground state' of the system is the polymer path with minimum energy, 
which we denote by P. One of the main goals of this paper is to understand 
a particular feature of the ground state, known as the chaos property. It 
says, roughly, that a small perturbation of the environment gives rise to a 
new ground state that is almost disjoint from the original one. 

There is a standard way to define a perturbation of a Gaussian environment.
If $g'$ is an independent copy of $g$, and we define the perturbed
environment $g^t := e^{-t} g + \sqrt{1 - e^{-2t}}\, g'$, then $g^t$ is again a
standard Gaussian random environment. The parameter $t$ is a measure of the
amount of perturbation. This definition arises naturally from running an
Ornstein-Uhlenbeck flow at each vertex for time $t$.
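
The perturbation can be illustrated numerically. The following is a minimal sketch, assuming a finite environment stored as a Python list; the function name `perturb`, the seed, and the sample size are illustrative choices and do not come from the paper. It checks empirically that the perturbed environment keeps unit variance and has correlation close to $e^{-t}$ with the original one.

```python
import math
import random


def perturb(g, g_prime, t):
    """Return e^{-t} g + sqrt(1 - e^{-2t}) g', coordinate by coordinate."""
    a = math.exp(-t)
    b = math.sqrt(1.0 - math.exp(-2.0 * t))
    return [a * x + b * y for x, y in zip(g, g_prime)]


rng = random.Random(0)          # fixed seed so the run is reproducible
size = 20000                    # number of sites (illustrative)
g = [rng.gauss(0.0, 1.0) for _ in range(size)]
g_prime = [rng.gauss(0.0, 1.0) for _ in range(size)]

t = 0.3
g_t = perturb(g, g_prime, t)

# Empirical second moment of g^t and empirical correlation between g and g^t.
second_moment = sum(x * x for x in g_t) / size
correlation = sum(x * y for x, y in zip(g, g_t)) / size
print(second_moment, correlation, math.exp(-t))
```

At $t = 0$ the function returns the original environment unchanged, and as $t$ grows the output approaches the independent copy.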

Formally, the property of chaos of the ground state means that there
exists $t_0(n)$ such that $t_0(n) \to 0$ and
$\sup_{t \ge t_0(n)} \mathbb{E}|P \cap P^t| = o(n)$ as $n \to \infty$,
where $P^t$ is the minimum energy path in the environment $g^t$, and
$|P \cap P^t|$ is the number of vertices common to the two paths. The definition
of 'almost disjoint' in this way makes sense, because $P$ and $P^t$ are both of
length $n$.

Although this is a widely studied phenomenon in the theoretical physics
literature on directed polymers (see e.g. [38], [83], [52], [25], [39], [66]),
there are no rigorous results. Using the techniques of this paper, we can prove
the following theorem.

Theorem 1.1. Fix $n$, and let $t_0 = (\log n)^{-1/2}$. Then for all $t \ge t_0$,

$$\mathbb{E}|P \cap P^t| \le \frac{Cn}{\sqrt{\log n}},$$

where $C$ is a universal constant.

Of course, this result only proves chaos in principle. The factor of
$\sqrt{\log n}$ is too slowly growing to be of any practical significance, even
when $n$ is of the order of the Avogadro number.
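
The quantity $|P \cap P^t|$ can be explored by simulation. The following is a minimal sketch, not the method of the paper: it finds the ground state by a standard dynamic program over lattice levels and counts the vertices shared by the ground states of an environment and its perturbation. The function names `sample_environment`, `ground_state`, and `overlap`, as well as the values of $n$, $t$, and the seed, are hypothetical choices made for illustration.

```python
import math
import random


def sample_environment(n, rng):
    """i.i.d. standard Gaussians on the vertices (i, j) with i + j <= n - 1."""
    return {(i, k - i): rng.gauss(0.0, 1.0)
            for k in range(n) for i in range(k + 1)}


def ground_state(g, n):
    """Minimum-energy directed path with n vertices, by dynamic programming.

    Minimizing E(p) = -sum of g_v is the same as maximizing the sum of g_v.
    """
    best = {(0, 0): g[(0, 0)]}   # best[v]: largest weight of a path from origin to v
    prev = {}                    # back-pointers used to recover the path
    for k in range(1, n):
        for i in range(k + 1):
            v = (i, k - i)
            candidates = []
            if i > 0:
                candidates.append((i - 1, k - i))   # arrive by a step to the right
            if k - i > 0:
                candidates.append((i, k - i - 1))   # arrive by a step upward
            u = max(candidates, key=lambda w: best[w])
            best[v] = best[u] + g[v]
            prev[v] = u
    end = max(((i, n - 1 - i) for i in range(n)), key=lambda w: best[w])
    path = [end]
    while path[-1] != (0, 0):
        path.append(prev[path[-1]])
    path.reverse()
    return path, -best[end]


def overlap(n, t, seed):
    """Number of vertices shared by the ground states in g and in g^t."""
    rng = random.Random(seed)
    g = sample_environment(n, rng)
    g_prime = sample_environment(n, rng)
    a, b = math.exp(-t), math.sqrt(1.0 - math.exp(-2.0 * t))
    g_t = {v: a * g[v] + b * g_prime[v] for v in g}
    p, _ = ground_state(g, n)
    p_t, _ = ground_state(g_t, n)
    return len(set(p) & set(p_t))


n = 60
for t in (0.0, 0.05, 0.5, 2.0):
    print(t, overlap(n, t, seed=1))
```

Since both paths start at the origin, the overlap is always at least 1, and at $t = 0$ it equals $n$. A single run says nothing about the expectation in Theorem 1.1; averaging over many seeds would be needed for that.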

The key to our approach is a seemingly new connection between the
fluctuations of the ground state energy and the stability of the minimum energy
path. Suppose $\tau$ is an Exponential random variable with mean 1, independent
of all else. Then we have the relation

$$\mathrm{Var}\bigl(\min_p E(p)\bigr) = \mathbb{E}|P \cap P^\tau|.$$

This comes as a consequence of a far more general result, which we describe
in the following pages. Given this formula, there are still two
tasks remaining: (a) to show that the variance is $o(n)$, and (b) to prove
a Tauberian theorem that extracts a bound on $\mathbb{E}|P \cap P^t|$ for fixed
$t$ from a bound on $\mathbb{E}|P \cap P^\tau|$. We carry out task (a) in
Section 8 and task (b) in Section 3 (more specifically, via Theorem 3.2, which
is one of the main results of this paper).

Let us now present our general framework, that encompasses the polymer 
model as a special case. We treat the energy landscape of the model as a 
giant Gaussian random vector, and prove general theorems about Gaussian 
vectors that imply results like Theorem 1.1. There are many other examples
that fall into this framework; besides polymers, the ones that have been
treated in this paper include spin glass models, random matrices, fitness 
models of evolutionary biology, the discrete Gaussian free field, and contin- 
uous Gaussian fields on Euclidean spaces. 

Let $S$ be a finite set and $X = (X_i)_{i \in S}$ be a centered Gaussian random
vector with possibly dependent coordinates. (In the context of the polymer
example, think of $S$ as the set of all directed polymers of length $n$ starting
at the origin, and $X_i = -E(i)$ for a path $i \in S$.) Let

$$R(i,j) := \mathrm{Cov}(X_i, X_j), \qquad \sigma^2 := \max_{i \in S} \mathrm{Var}(X_i).$$

(Again, for polymers $R(i,j) = |i \cap j|$, and $\sigma^2 = n$.) We will often
refer to $X$ as a 'Gaussian field'. The elements of $S$ will be alternately
called 'states' or 'indices' or 'sites' or 'coordinates'. The two central
objects of interest in this paper are (i) the maximum of the Gaussian field $X$,

$$M := \max_{i \in S} X_i,$$

and (ii) the location of the maximum,

$$I := \operatorname{argmax}_{i \in S} X_i.$$

To ensure that $I$ is well-defined, we assume the non-degeneracy condition

$$\mathbb{P}(X_i \ne X_j) = 1 \quad \text{for each } i \ne j. \tag{1}$$

We study $M$ through its mean and variance

$$m := \mathbb{E}(M), \qquad v := \mathrm{Var}(M).$$

Let us alert the reader that we will use the symbols $X$, $M$, $m$, $v$,
$R(i,j)$, and $I$ throughout the paper to mean what they stand for here, often
without explicit reference to the above definitions.

The following is a well-known result about the fluctuations of Gaussian 
maxima. 

Proposition 1.2. Irrespective of the correlation structure of the vector $X$,
we always have $v \le \sigma^2$.

In words, this means that the order of fluctuations of the maximum cannot
be larger than the order of the fluctuations of the most fluctuating
coordinate. This inequality was proved by Houdré [36], although the method
of proof seems to be implicit in the much earlier work of Nash [55], and
the works of Chernoff [18], Chen [17], and Houdré and Kagan [37] on the
so-called Poincaré inequality for the Gaussian measure.

There is a famous 'advanced version' of the Poincaré inequality, called
the Gaussian isoperimetric inequality, independently invented by Borell [12]
and Sudakov and Tsirelson [70], that gives tail bounds instead of simply a
variance inequality. A striking consequence of the isoperimetric inequality
is the following result of Tsirelson, Ibragimov, and Sudakov [81].

Proposition 1.3. For any $r > 0$,

$$\mathbb{P}(M - m > r) \le e^{-r^2/(2\sigma^2)},$$

and the same bound holds for $\mathbb{P}(M - m < -r)$ as well.

Although the above result is often called 'Borell's inequality', it is clearly
not a fair nomenclature. Since it is too cumbersome to call it the
'Borell-Tsirelson-Ibragimov-Sudakov inequality', we will simply refer to it as
Proposition 1.3 in this manuscript.

Note that although Proposition 1.3 is a deep and powerful result, conceptually
it does not say a lot more than Proposition 1.2, since it implies, just
as Proposition 1.2 does, that the fluctuations of the maximum can be at most of
order $\sigma^2$. In particular, it is a crude worst case bound that does not
use the correlation structure of $X$. However, this is all that one can obtain
from the classical theory of concentration of measure (see e.g. Ledoux [47]).

Here is where our investigation begins. What happens if $v$ is very small
compared to $\sigma^2$? As we will see, this is in fact the rule rather than the
exception in interesting examples. The main point of this paper is that
the condition $v \ll \sigma^2$ ushers in a whole host of interesting structure on
the field $X$. Indeed, the structure is so interesting and pervasive that the
condition seems to deserve a name of its own. When it happens, we will say
that the Gaussian field $X$ exhibits 'superconcentration'. The notion can be
precisely defined only in terms of a sequence of Gaussian fields rather than
a single one. Accordingly, let $(X_n)_{n \ge 1}$ be a sequence of centered
Gaussian fields, where $X_n$ is defined on a finite set $S_n$. Let

$$\sigma_n^2 := \max_{i \in S_n} \mathrm{Var}(X_{n,i}), \quad M_n := \max_{i \in S_n} X_{n,i}, \quad m_n := \mathbb{E}(M_n), \quad v_n := \mathrm{Var}(M_n).$$

Definition 1.4. We say that the sequence of Gaussian fields $(X_n)_{n \ge 1}$ 'has
superconcentrated maximum' or simply 'is superconcentrated' if
$v_n = o(\sigma_n^2)$ as $n \to \infty$.

Here, as usual, $a_n = o(b_n)$ means that $\lim_{n \to \infty} a_n/b_n = 0$. In
practice, we will simply say that $X$ is superconcentrated if it is implicitly
the $n$th member of a sequence of fields having superconcentrated maximum. We
will give many examples of superconcentrated Gaussian fields in the subsequent
sections to demonstrate the 'rule rather than exception' claim.

Incidentally, physicists often refer to the superconcentration phenomenon 
as 'anomalous fluctuations' (see e.g. [39]). However, it is not a well-defined
notion (in particular, they don't connect it with classical concentration, and 
'anomalous fluctuations' can also mean larger fluctuations than usual); we 
feel that our terminology is more evocative and precise. 

Let us now describe some of the consequences of superconcentration. A 
summary of the results is contained in Theorem 1.8, but we first need to
define some concepts. 

An important property of Gaussian fields that has been studied in var- 
ious special examples by physicists but has almost no presence in rigorous 
mathematics, is the property of chaos. We have already had a discussion of 
this in the context of polymers, so let us now make a general definition. Let 
$X'$ be an independent copy of $X$. For each $t \in [0, \infty)$, let

$$X^t := e^{-t} X + \sqrt{1 - e^{-2t}}\, X'.$$

Note that $X^t$ has the same distribution as $X$, that is, the transformation
$X \mapsto X^t$ is a distribution-preserving perturbation of $X$. We will be
mostly interested in small perturbations, i.e., small $t$. As mentioned before,
this is a natural way to define perturbations of Gaussian fields because of its
intimate relation to Ornstein-Uhlenbeck diffusions. Let $I^t$ be the state at
which the maximum is attained in $X^t$, that is,

$$I^t := \operatorname{argmax}_{i \in S} X^t_i.$$

Note that $I^t$ is well-defined by assumption (1), and that $I^0 = I$. We will
say that the field $X$ is 'chaotic' if $I^t$ is highly unstable, that is, a small
change in the value of $t$ causes, with high probability, a drastic change in
$I^t$. There are, of course, various notions to be made precise here. First of
all, what is meant by a drastic change in $I^t$? As we will see, in most examples
two states $i, j \in S$ are 'drastically different' if $R(i,j)$ is very small,
typically $R(i,j) \ll \sigma^2$. Thus, we may formulate chaos for $I^t$ to mean
that for $t = o(1)$, $\mathbb{E}(R(I^0, I^t)) \ll \sigma^2$. Secondly, what is a
'small change in $t$'? This we will take at face value, i.e. small means small,
in relation to nothing else.

However, none of this is precise. As before, the only way to make a 
completely meaningful definition is via sequences. 

Accordingly, let $(X_n)_{n \ge 1}$ be a sequence of Gaussian fields as in
Definition 1.4. Let $X_n'$ be an independent copy of $X_n$ and define, as above,
the perturbed fields

$$X_n^t := e^{-t} X_n + \sqrt{1 - e^{-2t}}\, X_n'.$$

Again, as above, let $I_n^t := \operatorname{argmax}_{i \in S_n} X^t_{n,i}$. Let
$R_n$ be the covariance kernel of $X_n$. We are now ready to give a precise
definition of the chaos property.

Definition 1.5. We say that the location of the maximum in $X_n$ exhibits
'chaos', or simply that $X_n$ exhibits chaos, if there is a sequence $t_n \to 0$
such that $\mathbb{E} R_n(I_n^0, I_n^{t_n}) = o(\sigma_n^2)$ as $n \to \infty$.

Note that here we defined chaos in terms of the decay of
$\mathbb{E} R_n(I_n^0, I_n^{t_n})$, instead of
$\sup_{t \ge t_n} \mathbb{E} R_n(I_n^0, I_n^t)$ as we did for the polymer example.
It turns out (Theorem 3.2) that $\mathbb{E} R_n(I_n^0, I_n^t)$ is always a
decreasing function of $t$, and therefore the two definitions are equivalent.

Next let us turn our attention to the so-called 'multiple valley picture'. 
Often, we have the situation that a Gaussian field X has many 'drastically 
different' sites at which the global maximum is nearly achieved. It is called 
'multiple valleys' instead of 'multiple peaks' because the physicists like to 
put a minus sign. We will, however, call it the multiple peaks phenomenon 
to avoid any confusion. As before, we attempt to give a precise definition 
via sequences of fields. All notation is the same as before. 

Definition 1.6. We say that $X_n$ has multiple peaks (MP) if there exist
$\ell_n \to \infty$, $\epsilon_n = o(\sigma_n^2)$, $\delta_n = o(m_n)$ and
$\gamma_n \to 0$ such that with probability at least $1 - \gamma_n$, there is a
set $A \subseteq S_n$ satisfying

(a) $|A| = \ell_n$,

(b) $R_n(i,j) \le \epsilon_n$ for each $i, j \in A$, $i \ne j$, and

(c) $X_{n,i} \ge M_n - \delta_n$ for each $i \in A$.

Note that the condition $\delta_n = o(m_n)$ is natural, because $X_{n,i}$ is
nearly maximum if $X_{n,i} = M_n - o(M_n)$. However, this is not the form that is
conjectured in the physics models. Indeed, since the fluctuations of $M_n$ are
of order $\sqrt{v_n}$, the physicists seem to think that multiple peaks, in the
above sense, should occur whenever $\delta_n \gg \sqrt{v_n}$, or at least when
$\delta_n = o(\sigma_n)$. This leads us to the definition of a stronger notion
of multiple peaks.

Definition 1.7. We say that $X_n$ has strong multiple peaks if the multiple
peaks condition is satisfied with $\delta_n = o(\sigma_n)$ instead of
$\delta_n = o(m_n)$.

The first main result of this paper, stated below, shows that the properties 
of superconcentration, chaos, multiple peaks, and strong multiple peaks are 
all intimately related to each other. This may not be surprising from a 
physicist's point of view, but this is the first time that these widely observed 
phenomena have been formulated and connected by rigorous mathematics. 

Theorem 1.8. For any sequence of Gaussian fields $(X_n)_{n \ge 1}$ satisfying the
non-degeneracy condition (1), we have

$$\text{Strong MP} \implies \text{Superconcentration} \iff \text{Chaos}.$$

Moreover, under the 'positivity assumption' that $R_n(i,j) \ge 0$ for each $n$
and $i, j \in S_n$, we have the more complete picture:

$$\text{Strong MP} \implies \text{Superconcentration} \iff \text{Chaos} \implies \text{MP}.$$

The counterexample that shows chaos $\not\Rightarrow$ strong MP is particularly
surprising and goes against common intuition. It shows that, contrary to what
one may think, chaos is not necessarily caused by the existence of multiple
peaks.

Theorem 1.8 is stated as a limiting result, but we do have precise
quantitative bounds for all parts of the theorem. These will be presented in
Section 3, where we give the proof of Theorem 1.8. One result from Section 3
is worth mentioning here, since it provides the foundation for all
subsequent work by connecting the value of the maximum with the location
of the maximum in a Gaussian field. It is Lemma 3.1, which says that if $\tau$
is an Exponential random variable with mean 1, independent of all else, then
we have the formula

$$v = \mathbb{E}(R(I^0, I^\tau)).$$

We have already stated the version of this formula for polymers. The proof 
of this identity would be easy for any expert on the Ornstein-Uhlenbeck 
semigroup (indeed, it is implicit in the classical proofs of Propositions 1.2
and 1.3), but the author has not seen it explicitly written down anywhere in
the published literature. Interestingly, it was brought to our notice that the 
identity does make an appearance (in a slightly different form) in a recent 
manuscript of Nourdin and Viens [58] that was prepared at the same time 
as the first version of this paper was being written. 

In spite of its cuteness the identity is not very useful on its own, since we
are interested in $\mathbb{E}(R(I^0, I^t))$ for fixed $t$. This question is
handled in Theorem 3.2, where we apply a Tauberian argument to the above
representation of $v$ to show that for each $t$,

$$0 \le \mathbb{E}(R(I^0, I^t)) \le \frac{v}{1 - e^{-t}} \quad \text{and} \quad v \le \sigma^2 (1 - e^{-t}) + \mathbb{E}(R(I^0, I^t))\, e^{-t}.$$

It turns out that these upper and lower bounds suffice to show the
equivalence of superconcentration and chaos (we show this in Section 3).

Before presenting further results, let us discuss some of the literature. As 
we mentioned before, the phenomena of superconcentration, chaos, and mul- 
tiple peaks have not been systematically studied in the mathematics litera- 
ture, so there are essentially very few references. The closest thing to chaos 
in the domain of rigorous mathematics is the notion of noise-sensitivity, 
although that refers mainly to correlations between functions. The litera- 
ture on noise-sensitivity in computer science is sizable; it mostly involves 
sensitivity of scalar functions of Boolean random variables to random noise, 
which is not so relevant to us. In the probability world, a very notable paper 
on the subject is due to Benjamini, Kalai, and Schramm [6]. A subsequent
paper [7] by the same authors is more relevant for what we do in this article.

One truly significant contribution to what we call superconcentration is 
due to Talagrand ([71], Theorem 1.5), which has been the source of many
applications (including ^). Talagrand's result provides a way to improve
variance bounds like Proposition 1.2 under certain situations by a 'factor of
$\log n$'. Talagrand's breakthrough idea was to use the tool of
hypercontractivity (discovered by Nelson [57]) to improve variance bounds. We
will use this method, in conjunction with ideas from Benjamini-Kalai-Schramm [7]
and our Theorem 1.8, to prove the existence of chaos in directed polymers
in Section 8.

Another contribution of Talagrand in our context is the proof of strong
MP $\implies$ superconcentration, which follows essentially from a sketch at the
end of Section 8.3 in his landmark paper [72]. Unfortunately, the author has 
not yet been able to find a use for this remarkable implication, since proving 
strong MP seems to be always more difficult than proving any of the other 
phenomena. 

In the physics literature, there is a long, folklorish history of studying the 
phenomena of superconcentration (via 'fluctuation exponents'), chaos, and 
multiple valleys. For instance, the implication that superconcentration
$\implies$ chaos is the central theme of Fisher and Huse [25], who investigated
it in the context of directed polymers in a random environment. Examples of
other highly cited works in this area are those of McKay, Berger, and
Kirkpatrick [50], Huse, Henley, and Fisher [38], Bray and Moore [13], Zhang [83]
and Mezard [52].

Let us now return to the discussion of our results. As we mentioned 
above, the only rigorous tool available at present that can establish super- 
concentration is Talagrand's method of using hypercontractivity to improve 
variance bounds. Although this works in many situations (including some 
of our examples in this paper), it gives only a '$\log n$ correction', and usually
does not suffice to break the barrier of 'improvements in powers of n'. 

One of the main goals of this work is to break this wall by finding an 
alternative technique. We have only had partial success in this direction, 
but what we have may lead to further progress. Under a certain condition 
that we call 'extremality', we are able to get improvements in powers of n 
in highly nontrivial models like certain cases of the generalized Sherrington- 
Kirkpatrick model of spin glasses. The success is 'partial' because, for in- 
stance, we are not able to cover the original 2-spin SK model. 

The notion of extremality of a Gaussian field like $X$ is defined as follows. It
can be shown (see Lemma 2.1) that irrespective of the correlation structure,
we have $m \le \sqrt{2\sigma^2 \log|S|}$, with near equality if the coordinates
are i.i.d. $N(0, \sigma^2)$. We say that the field $X$ is 'extremal' if
$m \sim \sqrt{2\sigma^2 \log|S|}$. Of course this makes sense only if we
consider sequences.

Definition 1.9. We say that $(X_n)_{n \ge 1}$ is extremal if

$$\lim_{n \to \infty} \frac{m_n}{\sqrt{2\sigma_n^2 \log|S_n|}} = 1.$$

Note that although extremality may seem to indicate that the coordinates of 
X„ are approximately independent for large n, that is not true. Extremality 
can hold even in models with high degrees of dependence between sites, like
the Gaussian free field (proved by Bolthausen, Deuschel, and Giacomin [11]),
branching random walks (proved by Biggins [9]), and some nontrivial spin
glasses (treated in Section 9). The second main result of this paper, proved
in Section 5, is the following.

Theorem 1.10. For any sequence of Gaussian fields $(X_n)_{n \ge 1}$ with state
spaces $S_n$ growing in size to infinity, we have

$$\text{Extremality} \implies \text{Chaos}.$$

Let us now present one example of a concrete variance bound that can be
used to establish superconcentration. The following theorem is proved in
Section 5, where it is deduced from a quantitative version of Theorem 1.10.
We shall use this result to establish the presence of chaos in certain models
of spin glasses in Section 9. Moreover, the variance bound given by the
following theorem, wherever it applies, gives 'corrections in powers of $n$'
instead of log corrections as given by hypercontractivity.

Theorem 1.11. Consider the field $X = (X_i)_{i \in S}$. Suppose
$R(i,i) = \sigma^2$ for all $i$. For each $i, j \in S$, let
$r_{ij} := R(i,j)/\sigma^2$. Let



Then $v \le C\sigma^2\beta$ and for any $t > 0$, $\mathbb{E}(R(I^0, I^t)) \le$ ^^=t,
where $C$ is a universal constant.

The bound can be interpreted easily by considering i.i.d. standard Gaussians.
If $r_{ij} = 0$ for all $i \ne j$ and $r_{ii} = 1$ for all $i$, we have
$\sum_{i,j \in S} |S|^{-2/(1 + r_{ij})} \le 2$,
which proves superconcentration (although not the correct order bound on
the variance in this case). In general, Theorem 1.11 gives a way of proving
superconcentration and chaos when 'most correlations are small', but may
not work in many situations.

The rest of the paper is organized as follows. In Section 2, we state some
well-known results about Gaussian random variables that will be useful for
us later on. In Section 3, we prove Theorem 1.8, together with a number of
results that give quantitative versions of the various implications in
Theorem 1.8. In Section 4, we give a brief introduction to the concept of
hypercontractivity for the Ornstein-Uhlenbeck semigroup and how to use
it for proving superconcentration. In Section 5, we prove Theorems 1.10
and 1.11. Finally, in Sections 6 through 11, we work out a number of
examples. This includes applications to eigenvectors of random matrices
(Section 6), the Kauffman-Levin NK fitness model of evolutionary biology
(Section 7), directed polymers in random environment (Section 8), the
generalized Sherrington-Kirkpatrick model of spin glasses (Section 9), the
discrete Gaussian free field (Section 10), and Gaussian fields on Euclidean
spaces (Section 11).

Let us now mention some conventions that we will follow in this paper.
First of all, the constant $C$ stands for a generic universal constant, whose
value may change from line to line. This is an invaluable help in lightening
notation. We will generally denote scalar variables and elements of
$\mathbb{R}^2$ by ordinary italic font variable names like $x$, $y$, $u$, etc.
We will use boldface in dimensions higher than 2. Finally, let us reiterate
that the symbols $X$, $X^t$, $M$, $m$, $v$, $\sigma^2$, $I^t$, and $R(i,j)$ will
be used without reference to denote what they denote in this section.

2. Some basic facts about Gaussian random variables 

In this section, as elsewhere, we continue to use the notation defined in
Section 1. In particular, $X$, $X^t$, $m$, $v$, $\sigma^2$, $R(i,j)$, and $I^t$
stand for the same objects as before. We state some very well-known facts about
Gaussian random variables and vectors that will be of repeated use for us in
the rest of the manuscript. Two such facts, namely Proposition 1.2 and
Proposition 1.3, have already been stated in Section 1.

Size of the maximum. Just like the variance, the expected value of the 
maximum of a Gaussian field also has a general, worst case bound. The 
bound is much easier to establish than the variance bound, so we give the 
proof right here. 

Lemma 2.1. We have the general bound

$$m \le \sqrt{2\sigma^2 \log|S|}.$$

Moreover, if $|S| \ge 2$ then for any $p \ge 1$,

$$\mathbb{E}|M|^p \le \mathbb{E}\max_i |X_i|^p \le C(p)\, \sigma^p (\log|S|)^{p/2},$$

where $C(p)$ is a constant that depends only on $p$.

Remark. We do not need that $(X_i)_{i \in S}$ are Gaussian for the bound on the
expectation; the proof goes through for any collection of random variables with
Gaussian tails, irrespective of the dependence among them. This observation
will be used a few times in the sequel.

Proof. Without loss of generality, assume that $\sigma = 1$. Then for any
$\beta > 0$,

$$m = \frac{1}{\beta}\, \mathbb{E}(\log e^{\beta M}) \le \frac{1}{\beta} \log \sum_{i \in S} \mathbb{E}(e^{\beta X_i}) \le \frac{\beta}{2} + \frac{\log|S|}{\beta}.$$

Optimizing over $\beta$ (the minimum is attained at $\beta = \sqrt{2\log|S|}$
when $|S| \ge 2$), we establish the first claim. For the second, one just has
to combine the bound on $m$ with Proposition 1.3 and observe that $\max_i |X_i|$
is the maximum of the concatenation of the vectors $X$ and $-X$. □
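
The first bound of Lemma 2.1 is easy to probe numerically in the i.i.d. case. The sketch below is an illustration only, under the assumption of independent $N(0,1)$ coordinates; the function name `mean_max`, the field sizes, and the number of trials are arbitrary choices. It compares a Monte Carlo estimate of $m$ with $\sqrt{2\log|S|}$.

```python
import math
import random


def mean_max(size, trials, rng):
    """Monte Carlo estimate of E max_i X_i for i.i.d. N(0, 1) coordinates."""
    total = 0.0
    for _ in range(trials):
        total += max(rng.gauss(0.0, 1.0) for _ in range(size))
    return total / trials


rng = random.Random(0)
for size in (10, 100, 1000):
    estimate = mean_max(size, 300, rng)
    bound = math.sqrt(2.0 * math.log(size))
    # The estimate should sit below the bound, with the ratio slowly approaching 1.
    print(size, round(estimate, 3), round(bound, 3))
```

The slow approach of the ratio to 1 is a reminder that 'near equality' in the i.i.d. case is an asymptotic statement.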

Slepian's lemma and Sudakov minoration. The following result is an
indispensable tool in the study of Gaussian processes. It was discovered by
Slepian [68] and goes by the name of 'Slepian's lemma'.

Lemma 2.2. Suppose $X = (X_i)_{i \in S}$ and $Y = (Y_i)_{i \in S}$ are centered
Gaussian random vectors with $\mathbb{E}(X_i^2) = \mathbb{E}(Y_i^2)$ for each $i$
and $\mathbb{E}(X_i X_j) \ge \mathbb{E}(Y_i Y_j)$ for each $i, j$. Then for each
$x \in \mathbb{R}$,

$$\mathbb{P}(\max_i X_i > x) \le \mathbb{P}(\max_i Y_i > x).$$

In particular, $\mathbb{E}(\max_i X_i) \le \mathbb{E}(\max_i Y_i)$.

The next result is a close analog of Slepian's inequality, known as the
Sudakov minoration lemma. For a proof, see Lemma 2.1.2 in [75].

Lemma 2.3. Suppose $a$ is a constant such that
$\mathbb{E}(X_i - X_j)^2 \ge a^2$ for all $i, j \in S$, $i \ne j$. Then

$$m \ge C a \sqrt{\log|S|},$$

where $C$ is a positive universal constant.

Mills ratio bounds. The following pair of inequalities is collectively known
as the Mills ratio bounds. For a standard Gaussian random variable $Z$, for
any $x > 0$, we have

$$\frac{x\, e^{-x^2/2}}{\sqrt{2\pi}\,(1 + x^2)} \le \mathbb{P}(Z > x) \le \frac{e^{-x^2/2}}{x\sqrt{2\pi}}.$$

The proof is not difficult, and may be found in numerous standard texts on
probability and statistics. The inequalities in the above form were probably
first proven by Gordon [29]. Along similar lines, one can also prove the
inequality

$$\mathbb{P}(Z > x) \le e^{-x^2/2}, \tag{2}$$

which follows simply by optimizing over $\theta > 0$ in the bound
$\mathbb{P}(Z > x) \le e^{-\theta x}\, \mathbb{E}(e^{\theta Z}) = e^{-\theta x + \theta^2/2}$.
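
These inequalities can be checked numerically. The following is a small sketch, with hypothetical function names `gaussian_tail` and `mills_bounds`, that evaluates the exact tail through the complementary error function and compares it with the two Mills ratio bounds and with the bound (2).

```python
import math


def gaussian_tail(x):
    """P(Z > x) for a standard Gaussian Z, via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def mills_bounds(x):
    """Lower and upper Mills ratio bounds for P(Z > x), valid for x > 0."""
    density = math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)
    return x * density / (1.0 + x * x), density / x


for x in (0.5, 1.0, 2.0, 4.0):
    lower, upper = mills_bounds(x)
    tail = gaussian_tail(x)
    # Columns: x, lower bound, exact tail, upper bound, cruder bound (2).
    print(x, lower, tail, upper, math.exp(-x * x / 2.0))
```

The two Mills bounds pinch together as $x$ grows, while the bound (2) loses the polynomial factor $1/x$.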

Gaussian integration by parts. Suppose $Z$ is a standard Gaussian random
variable, and $f : \mathbb{R} \to \mathbb{R}$ is an absolutely continuous
function. If $\mathbb{E}|f'(Z)| < \infty$, then one can argue that

$$\mathbb{E}|Z f(Z)| \le C\, \mathbb{E}|f'(Z)| + C < \infty, \tag{3}$$

where $C$ is a universal constant. Moreover, a standard application of
integration by parts gives the well-known identity

$$\mathbb{E}(Z f(Z)) = \mathbb{E}(f'(Z)).$$

This identity can be easily generalized to a Gaussian random vector like $X$,
as follows. If $f : \mathbb{R}^S \to \mathbb{R}$ is an absolutely continuous
function such that $\|\nabla f(X)\|$ has finite expectation, then for any
$i \in S$,

$$\mathbb{E}(X_i f(X)) = \sum_{j \in S} R(i,j)\, \mathbb{E}(\partial_j f(X)),$$

where $\partial_j f$ denotes the partial derivative of $f$ along the $j$th
coordinate. This identity can be derived from the previous one simply by
writing $X$ as a linear transformation of a vector of i.i.d. standard Gaussian
random variables. The author encountered this useful version of the
integration-by-parts identity in [74], Appendix A.6.

3. Structure of a superconcentrated Gaussian field 

The goal of this section is to prove Theorem 1.8. Throughout this section,
as everywhere else, we will freely use notation from Section 1 (like $X^t$,
$R(i,j)$, $m$, $v$, $I^t$, and $M$) without explicit reference. We will divide
the proof into a number of subsections, one devoted to each part of the proof.
The theorems of this section are all interesting in their own right, because
they give quantitative versions of the various implications of Theorem 1.8.

3.1. Superconcentration is equivalent to Chaos. We begin with the
exact formula for $v$ stated in Section 1.

Lemma 3.1. Let $\tau$ be a standard exponential random variable, independent
of everything else. Then we have

$$v = \mathbb{E}(R(I^0, I^\tau)). \tag{4}$$

Note that by the Cauchy-Schwarz inequality, $R(i,j) \le \sigma^2$ for all
$i, j$, so this identity implies Proposition 1.2.
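
The identity of Lemma 3.1 lends itself to a Monte Carlo sanity check on a small field. The following is a minimal sketch under illustrative assumptions: the matrix `B` (so that $X = BY$ has covariance $R = BB^T$), the function names, the seed, and the sample size are all hypothetical choices, not taken from the paper. It estimates $\mathrm{Var}(M)$ and $\mathbb{E}(R(I^0, I^\tau))$ from the same samples, with $\tau$ drawn from the standard exponential distribution.

```python
import math
import random

# Hypothetical 4-site field X = B Y, with Y a vector of 3 i.i.d. N(0, 1) variables.
B = [
    [1.0, 0.0, 0.0],
    [0.6, 0.8, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.7],
]
# Covariance kernel R = B B^T.
R = [[sum(a * b for a, b in zip(ri, rj)) for rj in B] for ri in B]


def field(y):
    """Evaluate X = B y."""
    return [sum(b * yk for b, yk in zip(row, y)) for row in B]


def argmax(x):
    return max(range(len(x)), key=lambda i: x[i])


def estimate(trials, seed):
    """Return Monte Carlo estimates of (Var(M), E R(I^0, I^tau))."""
    rng = random.Random(seed)
    d = len(B[0])
    maxima = []
    overlap_sum = 0.0
    for _ in range(trials):
        y = [rng.gauss(0.0, 1.0) for _ in range(d)]
        y_prime = [rng.gauss(0.0, 1.0) for _ in range(d)]
        tau = rng.expovariate(1.0)                 # Exponential with mean 1
        a = math.exp(-tau)
        b = math.sqrt(1.0 - math.exp(-2.0 * tau))
        y_tau = [a * u + b * w for u, w in zip(y, y_prime)]
        x, x_tau = field(y), field(y_tau)          # X and X^tau
        maxima.append(max(x))
        overlap_sum += R[argmax(x)][argmax(x_tau)]
    mean = sum(maxima) / trials
    variance = sum((m - mean) ** 2 for m in maxima) / (trials - 1)
    return variance, overlap_sum / trials


v_hat, overlap_hat = estimate(20000, seed=0)
print(v_hat, overlap_hat)
```

The two printed numbers should agree up to sampling error. Perturbing the latent vector $Y$ and then applying `B` is the same as perturbing $X$ directly, because the map $y \mapsto By$ is linear.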

Lemma 3.1, combined with an elementary Tauberian argument, leads to
the following theorem, which establishes the equivalence of superconcentration
and chaos.

Theorem 3.2. For each $t > 0$ we have

$$0 \le \mathbb{E}(R(I^0, I^t)) \le \frac{v}{1 - e^{-t}} \quad \text{and} \quad v \le \sigma^2 (1 - e^{-t}) + \mathbb{E}(R(I^0, I^t))\, e^{-t}.$$

Moreover, $\mathbb{E}(R(I^0, I^t))$ is a decreasing function of $t$.

To see how the bounds imply the claim of equivalence of superconcentration
and chaos, consider the following. If, in the notation of Section 1, we have
$v_n = o(\sigma_n^2)$, then by choosing $t_n = \sqrt{v_n}/\sigma_n = o(1)$ we can
guarantee by the first bound that

$$\mathbb{E}(R_n(I_n^0, I_n^{t_n})) \le O(\sqrt{v_n}\, \sigma_n) = o(\sigma_n^2).$$

Again, if we can find $t_n = o(1)$ such that
$\mathbb{E}(R_n(I_n^0, I_n^{t_n})) = o(\sigma_n^2)$, then the second bound shows
that $v_n = o(\sigma_n^2)$.
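
The arithmetic behind this choice of $t_n$ can be made concrete. The sketch below is illustrative only (the function names and the numerical values of $v$ and $\sigma^2$ are hypothetical): it evaluates the first bound $v/(1 - e^{-t})$ at $t = \sqrt{v}/\sigma$ and shows that it is close to $\sqrt{v}\,\sigma$ when $v/\sigma^2$ is small, because $1 - e^{-t} \approx t$ for small $t$.

```python
import math


def overlap_upper_bound(v, t):
    """First bound: E R(I^0, I^t) <= v / (1 - e^{-t})."""
    return v / (1.0 - math.exp(-t))


def variance_upper_bound(sigma2, expected_overlap, t):
    """Second bound: v <= sigma^2 (1 - e^{-t}) + E R(I^0, I^t) e^{-t}."""
    return sigma2 * (1.0 - math.exp(-t)) + expected_overlap * math.exp(-t)


sigma2 = 1.0
for v in (1e-1, 1e-2, 1e-4, 1e-6):
    t = math.sqrt(v / sigma2)                 # the choice t = sqrt(v) / sigma
    bound = overlap_upper_bound(v, t)
    # Last column: ratio of the bound to sqrt(v) * sigma, which tends to 1.
    print(v, t, bound, bound / math.sqrt(v * sigma2))
```

In the other direction, `variance_upper_bound` shows that a small expected overlap at a small time $t$ forces $v$ to be small relative to $\sigma^2$.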

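The proof of Lemma 3.1 is done in three simple steps.

Lemma 3.3. Let $Y = (Y_i)_{i \in S}$ be a vector of independent standard
Gaussian random variables, and let $Y'$ be an independent copy of $Y$. Let
$f : \mathbb{R}^S \to \mathbb{R}$ be an absolutely continuous function with
gradient $\nabla f$ and suppose that $\mathbb{E}\|\nabla f(Y)\|^2 < \infty$. For
each $t \in [0, \infty)$, let $Y^t = e^{-t} Y + \sqrt{1 - e^{-2t}}\, Y'$. Then

$$\mathrm{Var}(f(Y)) = \int_0^\infty e^{-t}\, \mathbb{E}\langle \nabla f(Y), \nabla f(Y^t) \rangle \, dt, \tag{5}$$

where $\langle \cdot, \cdot \rangle$ denotes the usual inner product.

Proof. Note that

$$\begin{aligned}
\mathrm{Var}(f(Y)) &= \mathbb{E}\bigl(f(Y)(f(Y) - f(Y'))\bigr) \\
&= \mathbb{E}\Bigl(-\int_0^\infty f(Y)\, \frac{\partial}{\partial t} f\bigl(e^{-t} Y + \sqrt{1 - e^{-2t}}\, Y'\bigr)\, dt\Bigr) \\
&= \mathbb{E}\Bigl(-\int_0^\infty f(Y) \sum_{i \in S} \Bigl(-e^{-t} Y_i + \frac{e^{-2t}}{\sqrt{1 - e^{-2t}}}\, Y_i'\Bigr)\, \partial_i f\bigl(e^{-t} Y + \sqrt{1 - e^{-2t}}\, Y'\bigr)\, dt\Bigr).
\end{aligned}$$

Now fix $t \in (0, \infty)$, and let

$$V^t := \sqrt{1 - e^{-2t}}\, Y - e^{-t}\, Y'.$$

Then $Y^t$ and $V^t$ are independent standard Gaussian random vectors and

$$Y = e^{-t}\, Y^t + \sqrt{1 - e^{-2t}}\, V^t.$$

Taking any $i$, and using Gaussian integration by parts as outlined in
Section 2 (in going from the second to the third line below), we get

$$\begin{aligned}
&\mathbb{E}\Bigl(f(Y)\Bigl(e^{-t} Y_i - \frac{e^{-2t}}{\sqrt{1 - e^{-2t}}}\, Y_i'\Bigr)\, \partial_i f\bigl(e^{-t} Y + \sqrt{1 - e^{-2t}}\, Y'\bigr)\Bigr) \\
&\qquad = \frac{e^{-t}}{\sqrt{1 - e^{-2t}}}\, \mathbb{E}\Bigl(f\bigl(e^{-t} Y^t + \sqrt{1 - e^{-2t}}\, V^t\bigr)\, V_i^t\, \partial_i f(Y^t)\Bigr) \\
&\qquad = e^{-t}\, \mathbb{E}\bigl(\partial_i f(Y)\, \partial_i f(Y^t)\bigr).
\end{aligned}$$

(Note that the condition $\mathbb{E}\|\nabla f(Y)\|^2 < \infty$ and the bound (3)
allow us to integrate by parts and interchange integrals and expectations.)
This completes the proof of the lemma. □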

Lemma 3.4. Let $f$ be as in Lemma 3.3. Then

$$\mathrm{Var}(f(X)) = \int_0^\infty e^{-t} \sum_{i,j \in S} R(i,j)\, \mathbb{E}\bigl(\partial_i f(X)\, \partial_j f(X^t)\bigr)\, dt.$$

Proof. If $R = BB^T$ for some matrix $B$, then we can assume $X = BY$ where
$Y$ is a standard Gaussian random vector. As before, let $Y'$ be an independent
copy of $Y$, so that $X' = BY'$ is an independent copy of $X$. Putting
$g(y) := f(By)$, and using Lemma 3.3 with $g$ instead of $f$, an easy
computation gives

$$\sum_{i} \partial_i g(Y)\, \partial_i g(Y^t) = \sum_{i,j \in S} R(i,j)\, \partial_i f(X)\, \partial_j f(X^t). \tag{6}$$

By Lemma 3.3 this completes the proof. □

Proof of Lemma 3.1. Consider the function $f(x) := \max_{i \in S} x_i$. We have

$$\partial_i f(x) = 1_{\{x_i \ge x_j \ \forall j\}} \quad \text{a.e.}$$

The proof now follows easily from Lemma 3.4. □

The proof of Theorem 3.2 requires one more lemma. The current proof
of the following result is a major simplification (thanks to Michel Ledoux)
of the author's proof in the first draft.

Lemma 3.5. Let $Y$ and $Y^t$ be as in Lemma 3.3. Then for any function
$h : \mathbb{R}^S \to \mathbb{R}$ such that $\mathbb{E}(h(Y)^2) < \infty$,
$\mathbb{E}(h(Y) h(Y^t))$ is a nonnegative, non-increasing function of $t$.

Proof. Fix $t \ge 0$. Let $Y''$ be another independent copy of $Y$. Let

$$Y^{-t/2} := e^{-t/2}\, Y + \sqrt{1 - e^{-t}}\, Y''.$$

It is easy to verify (by checking covariances) that the pair $(Y, Y^t)$ has the
same distribution as the pair $(Y^{-t/2}, Y^{t/2})$. Again, it is trivial to see
that given $Y$, the vectors $Y^{-t/2}$ and $Y^{t/2}$ are independent and
identically distributed. Thus,

$$\mathbb{E}(h(Y) h(Y^t)) = \mathbb{E}\bigl(h(Y^{-t/2})\, h(Y^{t/2})\bigr) = \mathbb{E}\bigl((P_{t/2} h(Y))^2\bigr), \tag{7}$$

where

$$P_s h(Y) := \mathbb{E}(h(Y^s) \mid Y).$$

This shows that $\mathbb{E}(h(Y) h(Y^t))$ is nonnegative. Now, it is easy to
verify that $(P_s)_{s \ge 0}$ is a semigroup of operators, that is,
$P_s P_t = P_{s+t}$. Using this, note that for any $s, t \ge 0$,

$$\begin{aligned}
\mathbb{E}\bigl((P_t h(Y))^2 - (P_{t+s} h(Y))^2\bigr) &= \mathbb{E}\bigl((P_t h(Y^s))^2 - (\mathbb{E}(P_t h(Y^s) \mid Y))^2\bigr) \\
&= \mathbb{E}\bigl(\mathrm{Var}(P_t h(Y^s) \mid Y)\bigr) \ge 0.
\end{aligned}$$

Combined with the representation (7), this shows that
$\mathbb{E}(h(Y) h(Y^t))$ is non-increasing in $t$. □

Proof of Theorem 3.2. By Lemma 3.5 and the representation (6), we see that
$\mathbb{E}(R(I^0, I^t))$ is nonnegative and non-increasing as a function of $t$. Combining
this with Lemma 3.1 we get that for any $t$,
\[
\begin{aligned}
v &= \int_0^\infty e^{-s}\,\mathbb{E}(R(I^0, I^s))\,ds \\
  &\ge \int_0^t e^{-s}\,\mathbb{E}(R(I^0, I^s))\,ds \\
  &\ge \int_0^t e^{-s}\,\mathbb{E}(R(I^0, I^t))\,ds = (1-e^{-t})\,\mathbb{E}(R(I^0, I^t)).
\end{aligned}
\]
Similarly,
\[
\begin{aligned}
v &\le \sigma^2\int_0^t e^{-s}\,ds + \mathbb{E}(R(I^0, I^t))\int_t^\infty e^{-s}\,ds \\
  &= \sigma^2(1-e^{-t}) + \mathbb{E}(R(I^0, I^t))\,e^{-t}.
\end{aligned}
\]
This completes the proof. □
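The integral identity $v = \int_0^\infty e^{-s}\,\mathbb{E}(R(I^0,I^s))\,ds$ and the two-sided bound just proved can be checked numerically in the smallest nontrivial case. The sketch below is an illustration only, under assumptions that are not in the text: the field consists of two i.i.d. standard Gaussian coordinates (so $R$ is the identity and $\sigma^2 = 1$), for which $\mathbb{E}(R(I^0,I^t)) = \mathbb{P}(I^0 = I^t) = 1/2 + \arcsin(e^{-t})/\pi$ and $\mathrm{Var}(\max(X_1,X_2)) = 1 - 1/\pi$. The function names are hypothetical.

```python
import math

# Toy field (an assumption for illustration): two i.i.d. N(0,1) coordinates,
# so R is the identity matrix and sigma^2 = 1.

def overlap(t):
    """E R(I^0, I^t) = P(I^0 = I^t): the sign of X_1 - X_2 is preserved
    under a perturbation with correlation e^{-t} (arcsine law)."""
    return 0.5 + math.asin(math.exp(-t)) / math.pi

def variance_via_integral(steps=100_000):
    """Midpoint rule for int_0^inf e^{-t} overlap(t) dt after the
    substitution u = e^{-t}, which turns it into int_0^1 (...) du."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        total += 0.5 + math.asin(u) / math.pi
    return total * h

# Exact variance of the maximum of two i.i.d. standard Gaussians.
EXACT_VARIANCE = 1.0 - 1.0 / math.pi

def two_sided_bounds(t, sigma2=1.0):
    """Lower and upper bounds on v taken from the two displays in the proof."""
    decay = math.exp(-t)
    r = overlap(t)
    return (1.0 - decay) * r, sigma2 * (1.0 - decay) + r * decay
```

Calling `variance_via_integral()` reproduces `EXACT_VARIANCE` up to discretization error, and `two_sided_bounds(t)` brackets it for every `t` tried.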

3.2. Strong Multiple Peaks implies Superconcentration. As usual, 
we give a quantitative version of this result. The difference with other parts 
of the proof is that this proof is not an original idea of the author; it follows 
from a sketch given at the end of Section 8.3 in Talagrand's famous treatise 
on concentration inequalities [72]. To the best of our knowledge, this sketch 
has never been formulated as a concrete theorem before. 

Theorem 3.6. Suppose $l$ is a positive integer, and $\epsilon > 0$, $\delta > 0$, and
$\gamma \in [0,1]$ are numbers such that with probability at least $1-\gamma$, there exists a set
$A \subseteq S$ such that

(a) $|A| \ge l$,

(b) $R(i,j) \le \epsilon$ for all $i,j \in A$, $i \ne j$, and

(c) $X_i \ge M - \delta$ for all $i \in A$.

Then $v \le 3\delta^2 + 3l^{-1}\sigma^2 + 3\epsilon + 16\sigma^2\sqrt{\gamma}$.

It is easy to see from the above result how we get superconcentration if we
have a sequence of Gaussian fields as in Section 1 satisfying the strong MP
condition with $l_n \to \infty$, $\epsilon_n = o(\sigma_n^2)$, $\delta_n = o(\sigma_n)$ and $\gamma_n \to 0$.

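Proof. Let
\[
U := \{(i_1, i_2, \dots, i_l) : i_1, \dots, i_l \in S,\ R(i_p, i_q) \le \epsilon \text{ for all } p \ne q\}.
\]
For each $(i_1, \dots, i_l) \in U$, let
\[
Z_{(i_1,\dots,i_l)} := \frac{1}{l}\sum_{p=1}^l X_{i_p}.
\]
A simple computation shows that
\[
\mathrm{Var}(Z_{(i_1,\dots,i_l)}) \le l^{-1}\sigma^2 + \epsilon.
\]
Thus, if we let
\[
\widetilde{M} := \max_{(i_1,\dots,i_l)\in U} Z_{(i_1,\dots,i_l)},
\]
then by Proposition 1.2,
\[
\mathrm{Var}(\widetilde{M}) \le l^{-1}\sigma^2 + \epsilon. \tag{8}
\]
Now, let $X'$ be an independent copy of $X$, and define $M'$, $Z'_{(i_1,\dots,i_l)}$ and $\widetilde{M}'$
accordingly. Let $E$ denote the event of the existence of a set $A$ satisfying
(a), (b), and (c) for the vector $X$ and a set $A'$ satisfying (a), (b), and (c) for
the vector $X'$. If $E$ happens, then $|M - \widetilde{M}| \le \delta$ and $|M' - \widetilde{M}'| \le \delta$. By the
inequality $(x+y+z)^2 \le 3(x^2+y^2+z^2)$ and the Cauchy-Schwarz inequality
in the third step below, we have
\[
\begin{aligned}
2\,\mathrm{Var}(M) &= \mathbb{E}(M - M')^2 \\
&= \mathbb{E}((M-M')^2; E) + \mathbb{E}((M-M')^2; E^c) \\
&\le 6\,\mathbb{E}((M-\widetilde{M})^2; E) + 3\,\mathbb{E}((\widetilde{M}-\widetilde{M}')^2; E) + \bigl[\mathbb{P}(E^c)\,\mathbb{E}(M-M')^4\bigr]^{1/2} \\
&\le 6\delta^2 + 3\,\mathbb{E}(\widetilde{M}-\widetilde{M}')^2 + \bigl(2\gamma\,\mathbb{E}(M-M')^4\bigr)^{1/2}.
\end{aligned}
\]
By (8), the second term is bounded by $6(l^{-1}\sigma^2 + \epsilon)$. By Proposition 1.3,
\[
\mathbb{E}(M-M')^4 \le 16\sigma^4\int_0^\infty x^3 e^{-x^2/8}\,dx = 512\sigma^4.
\]
This completes the proof. □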
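The two computations that the proof calls simple can be spelled out as follows. This is a sketch of one way to fill them in, not necessarily the author's route; in particular, the tail bound $\mathbb{P}(|M-M'| \ge x) \le 4e^{-x^2/(8\sigma^2)}$ used in the second step is an assumption about how Proposition 1.3 is applied.

```latex
% Variance of the average over a tuple in U (uses R(i,i) <= sigma^2 and R(i_p,i_q) <= epsilon):
\[
\mathrm{Var}\bigl(Z_{(i_1,\dots,i_l)}\bigr)
 = \frac{1}{l^2}\Bigl(\sum_{p=1}^l R(i_p,i_p) + \sum_{p\ne q} R(i_p,i_q)\Bigr)
 \le \frac{\sigma^2}{l} + \frac{l(l-1)}{l^2}\,\epsilon
 \le l^{-1}\sigma^2 + \epsilon .
\]
% Fourth moment: integrate the assumed tail bound, then substitute x = sigma*y and u = y^2/8.
\[
\mathbb{E}(M-M')^4 = \int_0^\infty 4x^3\,\mathbb{P}(|M-M'|\ge x)\,dx
 \le 16\sigma^4\int_0^\infty y^3e^{-y^2/8}\,dy
 = 16\sigma^4\int_0^\infty 32\,u\,e^{-u}\,du = 512\sigma^4 .
\]
% Hence (2*gamma*512*sigma^4)^{1/2} = 32*sigma^2*sqrt(gamma); dividing 2 Var(M) by 2 gives the theorem.
```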

3.3. Under positivity, Chaos implies Multiple Peaks. The goal of this
subsection is to prove that if $R(i,j) \ge 0$ for all $i,j$, then the property of chaos
guarantees the multiple peaks condition. As before, we have a quantitative
statement.

Theorem 3.7. Suppose $R(i,j) \ge 0$ for all $i,j$. Then for any integer $l \ge 2$,
any $\epsilon \in (0,\sigma^2)$, and any $\delta \in (0,m)$, we have that with probability at least



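\[
1 - \Bigl(\frac{16\,v\,m\,l^3\log l}{\epsilon\,\delta}\Bigr)^{1/2} - 2\Bigl(\frac{64\,v\,\sigma^4\,l^5\log l}{m\,\epsilon\,\delta^3}\Bigr)^{1/4} \tag{9}
\]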



there exists $A \subseteq S$ such that

(a) $|A| = l$,

(b) $R(i,j) \le \epsilon$ for all $i,j \in A$, $i \ne j$, and

(c) $X_i \ge M - \delta$ for all $i \in A$.

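It is not immediately clear why this theorem shows that chaos (or equivalently,
superconcentration) implies multiple peaks. Let us prove that implication
before starting the proof of Theorem 3.7. Recall the sequence $X_n$
from Section 1 and the associated quantities. Suppose $v_n = o(\sigma_n^2)$. Let
\[
\alpha_n := v_n/\sigma_n^2.
\]
Put
\[
\epsilon_n := \alpha_n^{1/3}\sigma_n^2, \qquad \delta_n := \alpha_n^{1/9}m_n, \qquad l_n := [\alpha_n^{-1/24}].
\]
Then for sufficiently large $n$, a simple computation gives
\[
\Bigl(\frac{16\,v_n m_n l_n^3\log l_n}{\epsilon_n\delta_n}\Bigr)^{1/2} + 2\Bigl(\frac{64\,v_n\sigma_n^4 l_n^5\log l_n}{m_n\epsilon_n\delta_n^3}\Bigr)^{1/4}
\le \alpha_n^{1/36} + \alpha_n^{1/48}\,\frac{\sigma_n}{m_n}.
\]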
Clearly, $\epsilon_n = o(\sigma_n^2)$, $\delta_n = o(m_n)$, and $l_n \to \infty$. So the above bound implies
multiple peaks as soon as we show that $\sigma_n/m_n$ remains bounded. More is
true, though. Let us now prove that $\sigma_n = o(m_n)$. Let $x_+$ denote the positive
part of a real number $x$, and let
\[
\psi(x) := \mathbb{E}(Z - x)_+^2,
\]
where $Z \sim N(0,1)$. Clearly, $\psi$ is everywhere positive and continuous,
and $\lim_{x\to\infty}\psi(x) = 0$. For each $n$, suppose $i_n$ is a coordinate such that
$\mathrm{Var}(X_{n,i_n}) = \sigma_n^2$. Since surely $M_n \ge X_{n,i_n}$, we have
\[
\begin{aligned}
v_n = \mathbb{E}(M_n - m_n)^2 &\ge \mathbb{E}(M_n - m_n)_+^2 \\
&\ge \mathbb{E}(X_{n,i_n} - m_n)_+^2 = \sigma_n^2\,\psi(m_n/\sigma_n).
\end{aligned}
\]
Since $v_n/\sigma_n^2 \to 0$, this shows that $m_n/\sigma_n \to \infty$, and completes our argument
for chaos $\Rightarrow$ MP using Theorem 3.7.
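The function $\psi(x) = \mathbb{E}(Z-x)_+^2$ used above has a closed form, which makes its positivity, monotonicity and decay easy to inspect. The sketch below is an illustration with hypothetical function names; the closed form $\psi(x) = (1+x^2)\,\mathbb{P}(Z > x) - x\,\varphi(x)$ follows from integrating $(z-x)^2\varphi(z)$ over $z > x$ and is not stated in the text.

```python
import math

def std_normal_pdf(x):
    """Density of N(0,1)."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)

def std_normal_sf(x):
    """Survival function P(Z > x), via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def psi(x):
    """Closed form of psi(x) = E[(Z - x)_+^2] for Z ~ N(0,1)."""
    return (1.0 + x * x) * std_normal_sf(x) - x * std_normal_pdf(x)

def psi_numeric(x, upper=12.0, steps=50_000):
    """Independent check: midpoint rule for int_x^upper (z - x)^2 pdf(z) dz."""
    h = (upper - x) / steps
    total = 0.0
    for k in range(steps):
        z = x + (k + 0.5) * h
        total += (z - x) ** 2 * std_normal_pdf(z)
    return total * h
```

Since $\psi$ is strictly positive and tends to zero only at $+\infty$, a bound $\psi(m_n/\sigma_n) \le v_n/\sigma_n^2 \to 0$ forces $m_n/\sigma_n \to \infty$, which is the step used in the argument.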

Let us now give a sketch of the proof of Theorem 3.7. The idea is very
simple: due to the chaos property, we can perturb the field a little bit to 
get a new maximum at a location 'far away' from the original maximum. 
However, since the perturbation is small and the value of the maximum is 
concentrated, it follows that the new location of the maximum must have 
been a near-maximal location in the unperturbed field. This shows the 
existence of at least two near-maximal points that are 'far away' from each 
other. Repeating the process as many times as we can, we get a large number 
of near-maxima. 
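The perturbation step in this sketch can be illustrated by a small simulation. The code below is not part of the paper's argument; it assumes the simplest possible field, $n$ i.i.d. standard Gaussian coordinates (so that "far away" just means "a different index"), and uses the interpolation $X^t = e^{-t}X + \sqrt{1-e^{-2t}}\,X'$. All function names are hypothetical.

```python
import math
import random

def perturb(x, x_prime, t):
    """Return X^t = e^{-t} X + sqrt(1 - e^{-2t}) X' coordinatewise."""
    a = math.exp(-t)
    b = math.sqrt(1.0 - math.exp(-2.0 * t))
    return [a * xi + b * yi for xi, yi in zip(x, x_prime)]

def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=values.__getitem__)

def argmax_survival_frequency(n, t, trials, seed=0):
    """Fraction of trials in which the maximizer of X is still the
    maximizer of the perturbed field X^t (i.i.d. N(0,1) coordinates)."""
    rng = random.Random(seed)
    same = 0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        x_prime = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if argmax(x) == argmax(perturb(x, x_prime, t)):
            same += 1
    return same / trials
```

For example, `argmax_survival_frequency(1000, 0.5, 200, seed=1)` estimates how often the maximizer survives a perturbation of size $t = 0.5$; in this toy model the frequency decreases as $n$ grows, which is the chaos phenomenon in its simplest form.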

Proof of Theorem 3.7. Let



\[
t := \sqrt{\frac{v\,\delta\log l}{l\,m\,\epsilon}}.
\]



Since $\delta < m$, we see that the quantity (9) in the statement of the theorem is
negative if $ltm/\delta > lt \ge 1$. Thus, we can assume without loss of generality that $lt < 1$. Then for any
$1 \le k \le l$, we have the crude bound $1 - e^{-kt} \ge kt/2$. Thus, by Theorem 3.2,

(10) ±^M'))<±^^<±f^<?^. 

k=l k=l k=l 

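Let $X^{(1)},\dots,X^{(l)}$ be i.i.d. copies of $X$, and recursively define $Z^{(0)},\dots,Z^{(l)}$
as follows. Let $Z^{(0)} := X$, and for each $k \ge 1$, let
\[
Z^{(k)} := e^{-t}Z^{(k-1)} + \sqrt{1-e^{-2t}}\,X^{(k)}.
\]
For each $k$, let
\[
L_k := \operatorname{argmax}_{i\in S} Z^{(k)}_i.
\]
It is easy to see by induction that $Z^{(k)}$ has the same distribution as $X$ for
every $k$, and
\[
\mathrm{Cov}(Z^{(j)}, Z^{(k)}) = e^{-|j-k|t}R.
\]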
This shows that
\[
(Z^{(j)}, Z^{(k)}) \stackrel{d}{=} (X^0, X^{|j-k|t}).
\]
In particular,
\[
\mathbb{E}(R(L_j, L_k)) = \mathbb{E}(R(I^0, I^{|j-k|t})).
\]
Thus, using (10) and the positivity of $R(i,j)$ we get that

P( max R{Lj,Lk) >e) <- ^ E{R{Lj,Lk)) 
(11) ' 'f'''' 

<iVE(i?(/M^*))<MM. 

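Now fix some $k \le l$. Note that $Z^{(k)}$ can be written as
\[
Z^{(k)} = e^{-kt}X + \sqrt{1-e^{-2kt}}\,Y,
\]
where $Y$ is an independent copy of $X$. Thus, if we let
\[
W := \sqrt{1-e^{-2kt}}\,X - e^{-kt}\,Y,
\]
then $Z^{(k)}$ and $W$ are independent (because $\mathrm{Cov}(Z^{(k)}, W) = 0$), and
\[
X = e^{-kt}Z^{(k)} + \sqrt{1-e^{-2kt}}\,W.
\]
Therefore,
\[
\begin{aligned}
\mathbb{E}|X_{L_k} - m| &\le \mathbb{E}|e^{-kt}Z^{(k)}_{L_k} - m| + \sqrt{1-e^{-2kt}}\,\mathbb{E}|W_{L_k}| \\
&\le e^{-kt}\,\mathbb{E}|Z^{(k)}_{L_k} - m| + (1-e^{-kt})m + \sqrt{1-e^{-2kt}}\,\mathbb{E}|W_{L_k}| \\
&\le \sqrt{v} + ktm + \sqrt{2kt}\,\sigma.
\end{aligned}
\]
Note that we crucially used the independence of $Z^{(k)}$ and $W$ in the last
line to conclude that $\mathbb{E}|W_{L_k}| \le \sigma$. From the above bound and Markov's
inequality, we get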



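\[
\begin{aligned}
\mathbb{P}\Bigl(\max_{1\le k\le l}|X_{L_k} - m| > \tfrac12\delta\Bigr) &\le \sum_{1\le k\le l}\mathbb{P}\bigl(|X_{L_k} - m| > \tfrac12\delta\bigr) \\
&\le \sum_{1\le k\le l}\frac{2(\sqrt{v} + ktm + \sqrt{2kt}\,\sigma)}{\delta} \\
&\le \frac{2(l\sqrt{v} + l^2tm + l^{3/2}\sigma\sqrt{2t})}{\delta}.
\end{aligned}
\]
Again, applying Markov's inequality we have
\[
\mathbb{P}\bigl(|M - m| > \tfrac12\delta\bigr) \le \frac{2\sqrt{v}}{\delta}.
\]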
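Combining the last two inequalities and the inequality (11), we get
\[
\begin{aligned}
&\mathbb{P}\Bigl(\max_{j\ne k} R(L_j, L_k) > \epsilon \ \text{ or } \ \max_k |X_{L_k} - m| > \tfrac12\delta \ \text{ or } \ |M - m| > \tfrac12\delta\Bigr) \\
&\qquad\le \frac{2vl\log l}{\epsilon t} + \frac{2(l\sqrt{v} + l^2tm + l^{3/2}\sigma\sqrt{2t})}{\delta} + \frac{2\sqrt{v}}{\delta} \\
&\qquad= \Bigl(\frac{16\,v\,m\,l^3\log l}{\epsilon\,\delta}\Bigr)^{1/2} + \frac{(2l+2)\sqrt{v}}{\delta} + \Bigl(\frac{64\,v\,\sigma^4\,l^5\log l}{m\,\epsilon\,\delta^3}\Bigr)^{1/4}.
\end{aligned}
\]
Now, since $\epsilon < \sigma^2$ and $l \ge 2$, the ratio of the second term to the third is
bounded by
\[
\Bigl(\frac{64\,v\,m\,\epsilon}{\delta\,\sigma^4\,l\log l}\Bigr)^{1/4} \le \Bigl(\frac{64\,v\,m}{\delta\,\epsilon\,l\log l}\Bigr)^{1/4} \le \Bigl(\frac{16\,v\,m\,l^3\log l}{\epsilon\,\delta}\Bigr)^{1/4},
\]
which is precisely the square-root of the first term. Without loss of generality,
we can assume that the first term is $\le 1$ (because we are bounding a
probability), and therefore the third term dominates the second.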

The proof is finished by defining $A := \{L_1, \dots, L_l\}$. Note that although
the construction of the set $A$ involved auxiliary randomness, the existence
of a set like $A$ depends just on the vector $X$, and hence the probability of
existence is not affected when we expand the probability space. □

3.4. Multiple Peaks does not imply Chaos. In this subsection, we exhibit
a sequence of Gaussian fields $(X_n)_{n\ge 1}$ that satisfies the Multiple Peaks
condition but is not chaotic. We will also arrange it so that the field has
positive correlations between sites and the same variance at each site, just
to show that these two factors don't play any role.

For simplicity, we fix $n$ and avoid putting it as a subscript. Let
\[
\{g^k_{ij},\ 1\le i\le n,\ j\in\{0,1\},\ 1\le k\le n\}
\]
be a collection of i.i.d. standard Gaussian random variables. Let $T$ be the
set of all maps from $\{1,\dots,n\}$ into $\{0,1\}$. For each $f \in T$ and each $k \le n$,
define
\[
Y^k_f := \frac{1}{\sqrt{n}}\sum_{i=1}^n g^k_{i f(i)}.
\]
Let $(Z_k)_{k\le n}$ be i.i.d. standard Gaussian random variables, and let
\[
\rho := 1 - n^{-1/3}.
\]
Finally, define the vector $X = (X^k_f)_{f\in T,\ k\le n}$ as
\[
X^k_f := \begin{cases} Y^k_f & \text{if } k = 1, \\ \rho\,Y^k_f + \sqrt{1-\rho^2}\,Z_k & \text{if } k > 1. \end{cases}
\]
Then note that for each $f, f' \in T$ and $k, k' \le n$,
\[
\mathrm{Var}(X^k_f) = 1, \qquad R((k,f),(k',f')) := \mathrm{Cov}(X^k_f, X^{k'}_{f'}) \ge 0.
\]
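Thus, in our notation, we have
\[
\sigma^2 = \max_{f\in T,\ k\le n}\mathrm{Var}(X^k_f) = 1.
\]
We claim that $X$ has multiple peaks but is not chaotic. First, we show the
existence of multiple peaks. Observe that for any $k \le n$,
\[
\max_f Y^k_f = \frac{\sum_{i=1}^n \max\{g^k_{i0}, g^k_{i1}\}}{\sqrt{n}}. \tag{12}
\]
Thus, if
\[
\mu := \mathbb{E}\max\{g^1_{10}, g^1_{11}\}
\]
and
\[
M_k := \max_f X^k_f,
\]
then we have
\[
\mathbb{E}(M_k) = \begin{cases} \mu\sqrt{n} & \text{if } k = 1, \\ \rho\mu\sqrt{n} & \text{if } k > 1. \end{cases}
\]
Note that
\[
\max_k M_k = \max_{f\in T,\ 1\le k\le n} X^k_f =: M.
\]
In particular,
\[
m_n := \mathbb{E}(M) \ge \mu\sqrt{n}.
\]
By Proposition 1.3, for each $k$ we have
\[
\mathbb{P}\bigl(|M_k - \mathbb{E}(M_k)| > 2\mu n^{1/6}\bigr) \le 2e^{-2\mu^2n^{1/3}}.
\]
Thus,
\[
\mathbb{P}\bigl(\max_k |M_k - \mathbb{E}(M_k)| > 2\mu n^{1/6}\bigr) \le 2ne^{-2\mu^2n^{1/3}}.
\]
Now,
\[
\begin{aligned}
\max_k |M_k - \mu\sqrt{n}| &\le \max_k |M_k - \mathbb{E}(M_k)| + \max_k |\mathbb{E}(M_k) - \mu\sqrt{n}| \\
&\le \max_k |M_k - \mathbb{E}(M_k)| + \mu n^{1/6}.
\end{aligned}
\]
Therefore,
\[
\mathbb{P}\bigl(\max_k |M_k - \mu\sqrt{n}| > 3\mu n^{1/6}\bigr) \le 2ne^{-2\mu^2n^{1/3}}. \tag{13}
\]
Now choose
\[
A = \{(1,f_1), (2,f_2), \dots, (n,f_n)\},
\]
where $f_k$ is the $f$ that maximizes $X^k_f$. By (13), if we take $\gamma_n = 2ne^{-2\mu^2n^{1/3}}$,
then with probability $\ge 1-\gamma_n$, we have
\[
\text{for all } k, \quad X^k_{f_k} = M_k \ge M - 6\mu n^{1/6}.
\]
Also note that $|A| = n$ and $R((i,f_i),(j,f_j)) = 0$ for all $i \ne j$. Thus, the
multiple peaks condition holds with $l_n = n$, $\epsilon_n = 0$, $\delta_n = 6\mu n^{1/6}$, and $\gamma_n$ as
above. Clearly, $l_n \to \infty$, $\epsilon_n = o(\sigma_n^2)$, $\delta_n = o(m_n)$, and $\gamma_n \to 0$.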
Next, let us show that $X$ is not chaotic. We accomplish this by showing
that $X$ is not superconcentrated. Since



Therefore, 

\[
\begin{aligned}
v_n := \mathrm{Var}(M) &\ge \mathbb{E}\bigl((M - \mathbb{E}(M))^2; M_1 = M\bigr) \\
&= \mathbb{E}(M_1 - \mathbb{E}(M))^2 - \mathbb{E}\bigl((M_1 - \mathbb{E}(M))^2; M_1 < M\bigr) \\
&\ge \mathrm{Var}(M_1) - \bigl[\mathbb{E}(M_1 - \mathbb{E}(M))^4\,\mathbb{P}(M_1 < M)\bigr]^{1/2}.
\end{aligned}
\]
From (12) we see that $\mathrm{Var}(M_1) = \mathrm{Var}(\max\{g^1_{10}, g^1_{11}\})$. From Proposition 1.3
it follows that $\mathbb{E}(M_1 - \mathbb{E}(M_1))^4 \le C$, where $C$ is a universal constant. Again,
by an argument similar to Lemma 2.1 (using the Gaussian tail bounds for
$M_1, \dots, M_n$ that we get by Proposition 1.3), we can show that
$|\mathbb{E}(M_1) - \mathbb{E}(M)| \le C\sqrt{\log n}$. Combining these observations and the exponentially
decaying bound on $\mathbb{P}(M_1 < M)$ obtained above, we see that $v_n$ can be
bounded below by a positive constant that does not depend on $n$. Since
$\sigma^2 = 1$, this completes the argument.

3.5. Chaos does not imply Strong Multiple Peaks. In this section, we
produce a sequence of Gaussian fields that is superconcentrated, but does
not have multiple peaks in the strong sense. As in the previous section, we
fix $n$ and avoid putting it as a subscript. Let $(g_{ij})_{1\le i,j\le n}$ be a collection of
i.i.d. standard Gaussian random variables. Let $\mathcal{F}$ be the set of all functions
from $\{1,\dots,n\}$ into itself. For each $f \in \mathcal{F}$, let
\[
X_f := \frac{1}{\sqrt{n}}\sum_{i=1}^n g_{i f(i)}.
\]
Proposition 11.31 implies that 

P(Mi < M) < P(Mi < p^/^ - { 



1/6 





\[
R(f,f') := \mathrm{Cov}(X_f, X_{f'}) = \frac{|\{i : f(i) = f'(i)\}|}{n} \in [0,1].
\]
Let $M_i := \max_{1\le j\le n} g_{ij}$, so that $M := \max_f X_f = n^{-1/2}\sum_{i=1}^n M_i$.
Since $M_1, \dots, M_n$ are i.i.d. and $\mathrm{Var}(M_i) \le C/\log n$ for some universal
constant $C$ (a well-known result; see Proposition 4.2), we have
\[
\mathrm{Var}(M) \le \frac{C}{\log n}.
\]
This shows that $X$ is superconcentrated. Let us now show that multiple
peaks are not present in the strong sense. We need the following simple
lemma about the difference between the largest and second largest values in
a realization of $n$ i.i.d. standard Gaussian random variables.

Lemma 3.8. Suppose $Z_1, \dots, Z_n$ are i.i.d. standard Gaussian random
variables. Let $D$ denote the difference between the largest and second largest of
these values. There is a constant $p > 0$ such that for any $n$,

V log nj 8 

Proof. Let $M_n$ and $M_n^-$ denote the largest and second largest values among
$Z_1, \dots, Z_n$. Let
\[
a_n := \sqrt{2\log n}, \qquad b_n := \sqrt{2\log n} - \frac{\log\log n + \log(4\pi)}{2\sqrt{2\log n}}.
\]
Let $\Phi(x) = \mathbb{P}(Z_1 \le x)$. Then clearly, for any $-\infty < x < y < \infty$,
\[
\mathbb{P}(M_n^- \le x,\ M_n > y) = n(1-\Phi(y))\,\Phi(x)^{n-1}.
\]
Using this and the Mills ratio bounds from Section 2 it is easy to show that
for any $-\infty < u < v < \infty$,
\[
\lim_{n\to\infty}\mathbb{P}\bigl(a_n(M_n^- - b_n) \le u,\ a_n(M_n - b_n) > v\bigr) = e^{-v}e^{-e^{-u}}.
\]
From this, it can be deduced that $a_n(M_n - M_n^-)$ converges in law to a
distribution that has a positive density on $(0,\infty)$ and no point masses. This proves
the lemma. □

Let us now finish the proof of the nonexistence of strong multiple peaks in
the field $X$. For each $i$, let $M_i^-$ be the second largest value among $g_{i1}, \dots, g_{in}$.
Let $D_i := M_i - M_i^-$. Let $D_{(1)} \le D_{(2)} \le \cdots \le D_{(n)}$ denote the $D_i$'s arranged
in increasing order. Let $p$ be the constant from the above lemma. Then by
the lemma and the weak law of large numbers we see that as $n \to \infty$,



lim P' 

n— >oo \ n 



i:Di> ^ 



\/log n 

(Note that we should have written $D_{n,i}$ instead of $D_i$, but we will be slack
about this kind of thing.) It is not difficult to see from here that

An/ 4:] . 

(14) limPK;i^(,)>-^ =L 

Let $f^*$ denote the maximizing function, that is $f^*(i) = \operatorname{argmax}_j g_{ij}$. Suppose
$f, f'$ are two functions such that $R(f,f') \le 1/2$. Then we must have that
$\min\{R(f,f^*), R(f',f^*)\} \le 3/4$, because if $R(f,f^*) > 3/4$ and $R(f',f^*) > 3/4$,
then $R(f,f') > 1/2$. Thus,



(15) Xf* - mm{Xf,Xf} > 



n 



Let $E$ denote the following event: For any $f, f' \in \mathcal{F}$ such that $R(f,f') \le 1/2$,
at least one of $X_f$ and $X_{f'}$ is less than



Xf* 



8 V log n 

Then from (14) and (15), we see that $\mathbb{P}(E) \to 1$ as $n \to \infty$. This shows
that the Strong Multiple Peaks condition does not hold for the sequence of
Gaussian fields under consideration.



4. Hypercontractivity 

Suppose we have a semigroup of operators $(P_t)_{t\ge 0}$, acting on some space of
functions on $\mathbb{R}^n$ (semigroup means $P_tP_s = P_{t+s}$). Suppose $\mu$ is an invariant
measure for the semigroup, meaning that $\int P_tf\,d\mu = \int f\,d\mu$ for any $f$ in the
domain of $P_t$. The semigroup is said to be 'hypercontractive' if for any $t > 0$,
there are numbers $q > p > 1$ (possibly depending on $t$) such that $P_t$ maps
$L^p(\mu)$ into $L^q(\mu)$ and moreover, $\|P_tf\|_q \le \|f\|_p$ for all $f \in L^p(\mu)$. Here $\|\cdot\|_p$
denotes the standard $L^p$-norm on $L^p(\mu)$. This phenomenon was discovered
by Nelson [56] and established for the Ornstein-Uhlenbeck semigroup by him
a few years later [57]. The O-U semigroup on $\mathbb{R}^n$ is defined as follows. Let
$Z$ be a standard Gaussian random vector in $\mathbb{R}^n$. For any function $f$ and
$t \ge 0$, define the function $P_tf$ as
\[
P_tf(x) := \mathbb{E}f\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,Z\bigr).
\]
It is easy to see that the standard Gaussian measure on $\mathbb{R}^n$, which we denote
by $\gamma^n$, is an invariant measure for this semigroup. Nelson [57] showed that
for any $p > 1$ and $t \ge 0$, if we let $q = 1 + e^{2t}(p-1) \ge p$, then for all
$f \in L^p(\gamma^n)$, we have
\[
\|P_tf\|_q \le \|f\|_p. \tag{16}
\]
Note that the result does not depend on the dimension $n$ at all. This is
one of the remarkable properties that make hypercontractivity a deep and
powerful tool.
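To see that the relation $q = 1 + e^{2t}(p-1)$ in Nelson's inequality cannot be improved, it helps to work through one explicit function. The choice $f(x) = e^{ax}$ in dimension one is an illustration that is not taken from the text; it uses only the Gaussian moment generating function $\mathbb{E}e^{sZ} = e^{s^2/2}$.

```latex
% Hypothetical test function f(x) = e^{ax} on the real line, a real.
\begin{align*}
P_tf(x) &= \mathbb{E}\exp\bigl(ae^{-t}x + a\sqrt{1-e^{-2t}}\,Z\bigr)
         = \exp\bigl(ae^{-t}x + \tfrac12 a^2(1-e^{-2t})\bigr), \\
\|f\|_p &= \bigl(\mathbb{E}e^{paZ}\bigr)^{1/p} = e^{pa^2/2}, \\
\|P_tf\|_q &= e^{a^2(1-e^{-2t})/2}\,\bigl(\mathbb{E}e^{qae^{-t}Z}\bigr)^{1/q}
           = \exp\bigl(\tfrac12 a^2(1-e^{-2t}) + \tfrac12 q a^2e^{-2t}\bigr).
\end{align*}
% With q = 1 + e^{2t}(p-1) one has q e^{-2t} = e^{-2t} + p - 1, hence
\[
\|P_tf\|_q = e^{pa^2/2} = \|f\|_p .
\]
```

So exponentials attain equality in (16), and any larger value of $q$ would violate the inequality for this $f$.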

The study of hypercontractive semigroups was given a major boost by 
the discovery of the connection between hypercontractivity and logarith- 
mic Sobolev inequalities by Gross [31]. For surveys of the extensive lit- 
erature that developed around this topic, one can look in the wonderful 
monographs [4] and [35] . 

The connection between logarithmic Sobolev inequalities (and hence hy- 
percontractivity) and concentration of measure was discovered by I. Herbst 
in a small unpublished note (see Ledoux [47], Theorem 5.3). However, the 
Herbst argument can only establish ordinary concentration of measure. The
first application of hypercontractivity to prove what we call superconcentration
was due to Talagrand [71]. Talagrand's method was subsequently used
by Benjamini, Kalai, and Schramm [7] to prove superconcentration in first
passage percolation (they call it 'sublinear variance').

The way we use hypercontractivity in this paper to establish superconcen- 
tration is essentially the same in spirit as Talagrand's technique, although 
there is the difference that while Talagrand's method applies only to func- 
tions of independent random variables, we can deal with strong dependence 
as in the Gaussian free field. 

Recall that in our setting, we have a vector $X = (X_i)_{i\in S}$ that is centered
Gaussian but not necessarily with independent components. We can still
define a semigroup $(P_t)_{t\ge 0}$ as
\[
P_tf(x) := \mathbb{E}f\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,X\bigr).
\]
Note that this semigroup retains the same hypercontractive property (16) as
the standard O-U semigroup, with exactly the same relation between $p$ and
$q$. This can be argued as follows. In our notation, $R = (R(i,j))_{i,j\in S}$ is the
covariance matrix of $X$. Assuming that the vector $X$ is not identically equal
to zero, we know that for some $d \le |S|$, there is a $|S| \times d$ matrix $B$ of full rank
such that $BB^T = R$. We can assume that there is a $d$-dimensional standard
Gaussian vector $Z$ such that $X = BZ$. Let $Z'$ be an independent copy of $Z$,
and let $X' = BZ'$. Given $f : \mathbb{R}^S \to \mathbb{R}$, define the function $g : \mathbb{R}^d \to \mathbb{R}$ as
\[
g(x) := f(Bx).
\]
The random vector $X$ is supported on the image of $\mathbb{R}^d$ under the map $B$.
On this subspace the map $B$ has an inverse, which we denote by $B^{-1}$. Then
we have
\[
\begin{aligned}
P_tf(x) &= \mathbb{E}f\bigl(e^{-t}x + \sqrt{1-e^{-2t}}\,X\bigr) \\
&= \mathbb{E}f\bigl(e^{-t}BB^{-1}x + \sqrt{1-e^{-2t}}\,BZ\bigr) \\
&= \mathbb{E}g\bigl(e^{-t}B^{-1}x + \sqrt{1-e^{-2t}}\,Z\bigr) = P_tg(B^{-1}x).
\end{aligned}
\]
Therefore, we have that for every $t \ge 0$,
\[
P_tf(X) = P_tg(B^{-1}X) = P_tg(Z),
\]
and so by (16),
\[
\|P_tf(X)\|_q = \|P_tg(Z)\|_q \le \|g(Z)\|_p = \|f(X)\|_p.
\]
Thus, the hypercontractive property (16) holds for the semigroup $(P_t)_{t\ge 0}$
with the same $q, p$. The following lemma, which is of crucial importance to
us, is a direct consequence. Recall all notation from Section 1 including the
definition of $X^t$.

Lemma 4.1. For any measurable $f : \mathbb{R}^S \to [0,1]$ and any $t \ge 0$, we have
\[
\mathbb{E}(f(X)f(X^t)) \le (\mathbb{E}f(X))^{1+\tanh(t/2)}.
\]
Proof. Take any $p > 1$ and let $q = 1 + e^{2t}(p-1)$. Let $q' = q/(q-1)$. Since
$f$ maps into $[0,1]$, we have
\[
\begin{aligned}
\mathbb{E}(f(X)f(X^t)) = \mathbb{E}(f(X)P_tf(X)) &\le \|f(X)\|_{q'}\,\|P_tf(X)\|_q \\
&\le \|f(X)\|_{q'}\,\|f(X)\|_p \\
&\le (\mathbb{E}f(X))^{\frac{1}{q'}+\frac{1}{p}}.
\end{aligned}
\]
We now optimize over $p$, which yields
\[
p = 1 + e^{-t}.
\]
An easy computation gives
\[
\frac{1}{q'} + \frac{1}{p} = 1 - \frac{1}{1+e^{2t}(p-1)} + \frac{1}{p} = 1 + \tanh(t/2).
\]
This completes the proof. □
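Both steps of this proof can be checked numerically. The sketch below is an illustration with hypothetical function names: it confirms that $p = 1 + e^{-t}$ maximizes the Hölder exponent $1/q' + 1/p$ (the right target, since $\mathbb{E}f(X) \le 1$ makes a larger exponent a smaller bound), and it compares the lemma with a Monte Carlo estimate in the simplest case of a one-dimensional standard Gaussian $X$ and the indicator $f(x) = 1\{x > a\}$, which is an assumption made only for the example.

```python
import math
import random

def holder_exponent(p, t):
    """1/q' + 1/p with q = 1 + e^{2t}(p - 1) and q' = q/(q - 1)."""
    q = 1.0 + math.exp(2.0 * t) * (p - 1.0)
    return (1.0 - 1.0 / q) + 1.0 / p

def best_p(t):
    """The optimizing value of p found in the proof."""
    return 1.0 + math.exp(-t)

def overlap_probability(a, t, samples, seed=0):
    """Monte Carlo estimate of E[f(X) f(X^t)] for f(x) = 1{x > a},
    X ~ N(0,1) scalar and X^t = e^{-t} X + sqrt(1 - e^{-2t}) X'."""
    rng = random.Random(seed)
    c = math.exp(-t)
    s = math.sqrt(1.0 - math.exp(-2.0 * t))
    hits = 0
    for _ in range(samples):
        x = rng.gauss(0.0, 1.0)
        x_t = c * x + s * rng.gauss(0.0, 1.0)
        if x > a and x_t > a:
            hits += 1
    return hits / samples

def lemma_bound(a, t):
    """Right-hand side (E f(X))^{1 + tanh(t/2)} with E f(X) = P(X > a)."""
    tail = 0.5 * math.erfc(a / math.sqrt(2.0))
    return tail ** (1.0 + math.tanh(t / 2.0))
```

For instance, `overlap_probability(1.0, 0.5, 100_000, seed=1)` can be compared against `lemma_bound(1.0, 0.5)`; the estimate should fall below the bound, with a visible gap because indicators of half-lines are not extremal.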

Lemma 4.1 is a very useful tool, as we will see in our proof of superconcentration
in directed polymers in Section 8 and in the Kauffman-Levin NK
model in Section 7. It is particularly potent in combination with Lemma 3.3
or Lemma 3.1. To demonstrate an immediate application, let us work out
the following simple result that we do not know how to prove without using
hypercontractivity. It says that the variance of the maximum is small whenever
the correlations are uniformly small. Recall that $v = \mathrm{Var}(\max_i X_i)$ and
$R(i,j) = \mathrm{Cov}(X_i, X_j)$.

Proposition 4.2. Suppose that $R(i,j) \le \rho$ for each $i \ne j$ and $R(i,i) = \sigma^2$
for each $i$. Then
\[
v \le \frac{C\sigma^2}{\log|S|} + C\rho,
\]
where $C$ is a universal constant.

Remark. It can be easily shown that the bound is sharp by considering the
case where $R(i,j) = \rho$ for each $i \ne j$. However, in spite of the sharpness,
the result is probably not very useful since the uniform boundedness is a
very strong restriction. It is presented here only for illustration purposes.

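Proof. For any vector $x \in \mathbb{R}^S$ all of whose coordinates have distinct values,
define
\[
h_i(x) := 1_{\{x_i = \max_j x_j\}}.
\]
Then by Lemma 4.1 and the hypothesis of this theorem, we have
\[
\begin{aligned}
\mathbb{E}(R(I^0, I^t)) &\le \sigma^2\,\mathbb{P}(I^0 = I^t) + \rho\,\mathbb{P}(I^0 \ne I^t) \\
&= \sigma^2\sum_{i\in S}\mathbb{E}(h_i(X^0)h_i(X^t)) + \rho\,\mathbb{P}(I^0 \ne I^t) \\
&\le \sigma^2\bigl(\max_i \mathbb{P}(I^0 = i)\bigr)^{\tanh(t/2)}\sum_{i\in S}\mathbb{P}(I^0 = i) + \rho \\
&= \sigma^2\bigl(\max_i \mathbb{P}(I^0 = i)\bigr)^{\tanh(t/2)} + \rho.
\end{aligned}
\]
Now, by increasing the value of the constant in the statement of the theorem
(and recalling Proposition 1.2), we can assume without loss of generality that
$\rho \le \sigma^2/2$. Under this assumption, by the Sudakov minoration technique
stated in Section 2 we have
\[
m \ge C\sigma\sqrt{\log|S|},
\]
where we are using our convention of letting $C$ denote any universal constant.
Thus, by Proposition 1.3 and the Gaussian tail bound we have
\[
\begin{aligned}
\mathbb{P}(I^0 = i) &\le \mathbb{P}(X_i \ge m/2) + \mathbb{P}(M \le m/2) \\
&\le 2e^{-m^2/8\sigma^2} \le |S|^{-C}.
\end{aligned}
\]
Now, for each $t \ge 0$,
\[
\tanh(t/2) = \frac{e^{t/2} - e^{-t/2}}{e^{t/2} + e^{-t/2}} = \frac{(1-e^{-t/2})(1+e^{-t/2})}{1+e^{-t}} \ge 1 - e^{-t/2}. \tag{17}
\]
Combining this with the bounds on $\mathbb{E}(R(I^0, I^t))$ and $\mathbb{P}(I^0 = i)$ obtained
above and Lemma 3.1 we get
\[
\begin{aligned}
v &= \int_0^\infty e^{-t}\,\mathbb{E}(R(I^0, I^t))\,dt \\
&\le \int_0^\infty e^{-t}\bigl(\sigma^2|S|^{-C(1-e^{-t/2})} + \rho\bigr)\,dt \\
&\le \frac{C\sigma^2}{\log|S|} + \rho.
\end{aligned}
\]
This completes the proof. □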

A part of the proof of Proposition 4.2 generalizes to the following technique,
which will help us bound the variance of the maximum of a discrete
Gaussian free field in Section 10. It will also be used for analyzing continuous
Gaussian fields on Euclidean spaces in Section 11.




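Theorem 4.3. Suppose that for each $r \ge 0$, there is a covering $\mathcal{C}(r)$ of $S$
such that whenever $i,j$ are indices with $R(i,j) \ge r$, we have $i,j \in D$ for
some $D \in \mathcal{C}(r)$. For each $A \subseteq S$ let $p(A) := \mathbb{P}(I^0 \in A)$, and define
\[
\rho(r) := \max_{D\in\mathcal{C}(r)} p(D), \qquad \mu(r) := \sum_{D\in\mathcal{C}(r)} p(D) = \mathbb{E}\,|\{D \in \mathcal{C}(r) : I^0 \in D\}|.
\]
Then we have
\[
v \le \int_0^{\sigma^2}\frac{2\mu(r)(1-\rho(r))}{-\log\rho(r)}\,dr,
\]
where we interpret $(1-x)/\log x = -1$ when $x = 1$.

Proof. By Lemma 4.1 we have
\[
\begin{aligned}
\mathbb{P}(R(I^0, I^t) \ge r) &\le \sum_{D\in\mathcal{C}(r)}\mathbb{P}(I^0 \in D,\ I^t \in D) \\
&\le \sum_{D\in\mathcal{C}(r)} p(D)^{1+\tanh(t/2)} \\
&\le \rho(r)^{\tanh(t/2)}\sum_{D\in\mathcal{C}(r)} p(D) = \mu(r)\,\rho(r)^{\tanh(t/2)}.
\end{aligned}
\]
Thus, by Lemma 3.1 we have
\[
\begin{aligned}
v &= \int_0^\infty e^{-t}\,\mathbb{E}(R(I^0, I^t))\,dt \\
&\le \int_0^\infty\!\!\int_0^{\sigma^2} e^{-t}\,\mathbb{P}(R(I^0, I^t) \ge r)\,dr\,dt \\
&\le \int_0^{\sigma^2}\!\!\int_0^\infty e^{-t}\mu(r)\,\rho(r)^{\tanh(t/2)}\,dt\,dr.
\end{aligned}
\]
Now, by (17) we have that for any fixed $r$,
\[
\int_0^\infty e^{-t}\rho(r)^{\tanh(t/2)}\,dt \le \int_0^\infty e^{-t}\rho(r)^{1-e^{-t/2}}\,dt \le \frac{2(1-\rho(r))}{-\log\rho(r)}.
\]
This completes the proof. □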



5. Extremal fields 

The goal of this section is to prove Theorems 1.10 and 1.11. As usual, we
prove quantitative versions. Recall the definitions of $X$, $X^t$, $M$, $m$, $I^t$, $\sigma^2$,
and $R(i,j)$ from Section 1. The following theorem is the main result of this
section.
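Theorem 5.1. Let
\[
\alpha := \frac{m}{\sqrt{2\sigma^2\log|S|}}, \qquad \beta := \sqrt{1-\alpha^2} + \Bigl(\frac{\log\log|S|}{\log|S|}\Bigr)^{1/4}.
\]
Then $v \le C\sigma^2\beta$ and for any $t \ge 0$, $\mathbb{E}(R(I^0, I^t)) \le \dfrac{C\sigma^2\beta}{\sqrt{1-e^{-2t}}}$, where $C$ is a
universal constant.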

It is clear how Theorem 1.10 follows from this result. However, Theorem 1.11
still needs a proof. Let us prove it before moving on to prove Theorem 5.1.
As usual $C$ will denote any universal constant, whose value may change from
line to line.

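Proof of Theorem 1.11. Without loss of generality, assume that $\sigma^2 = 1$. Let
$n = |S|$, and define
\[
N := |\{i : X_i \ge \sqrt{2\log n}\}|.
\]
Then by the Mills ratio lower bound from Section 2 and the assumption that
$R(i,i) = 1$ for all $i$, we have
\[
\mathbb{E}(N) \ge Cn\,\frac{e^{-\log n}}{\sqrt{2\log n}} = \frac{C}{\sqrt{2\log n}}.
\]
Again, we have
\[
\begin{aligned}
\mathbb{E}(N^2) &= \sum_{i,j}\mathbb{P}\bigl(X_i \ge \sqrt{2\log n},\ X_j \ge \sqrt{2\log n}\bigr) \\
&\le \sum_{i,j}\mathbb{P}\bigl(X_i + X_j \ge 2\sqrt{2\log n}\bigr) \\
&\le \sum_{i,j}\exp\Bigl(-\frac{4\log n}{\mathrm{Var}(X_i + X_j)}\Bigr) = \sum_{i,j}\exp\Bigl(-\frac{2\log n}{1+R(i,j)}\Bigr).
\end{aligned}
\]
Applying the Paley-Zygmund second moment method, we get
\[
\mathbb{P}(M \ge \sqrt{2\log n}) = \mathbb{P}(N > 0) \ge \frac{(\mathbb{E}(N))^2}{\mathbb{E}(N^2)} \ge \frac{C}{\sum_{i,j} n^{-2/(1+R(i,j))}\log n}.
\]
Again, by Proposition 1.3 we know that for any $x \ge 0$,
\[
\mathbb{P}(M - \mathbb{E}(M) \ge x) \le e^{-x^2/2}.
\]
Combining, we see that
\[
\mathbb{E}(M) \ge \sqrt{2\log n} - C\Bigl(\log\log n + \log\sum_{i,j} n^{-2/(1+R(i,j))}\Bigr)^{1/2}.
\]
The proof now follows from Theorem 5.1. □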
Let us now give a brief idea of the proof of Theorem 5.1 before going into
the technical details. The purpose of this sketch is to succinctly convey an
idea that may have the potential to grow as an alternative to the hypercontractive
method for proving superconcentration. We begin by assuming
that $\sigma^2 = 1$. The first step is to show that for any $t \ge 0$, $X_{I^t} \approx e^{-t}m$ with
high probability. We carry out this important step in Lemma 5.2. Next, we
fix $t$ and let $D = \{i : X_i \approx e^{-t}m\}$, where the meaning of $\approx$ has to be made
precise. A simple first moment bound shows that with high probability,
$|D| \lesssim n^{1-e^{-2t}}$. Next we fix some $r > 0$ and let $B = \{i : R(I^0, i) \ge r\}$. The
key observation is that since $X'$ and $X$ are independent and $\sigma^2 = 1$, we have
$\mathrm{Var}(X'_i - rX'_{I^0} \mid X) \le 1 - r^2$ for each $i \in B$, and hence
\[
\mathbb{E}\bigl(\max_{i\in B\cap D} X'_i \mid X\bigr) = \mathbb{E}\bigl(\max_{i\in B\cap D}(X'_i - rX'_{I^0}) \mid X\bigr) \le \sqrt{(1-r^2)\,2\log|D|}.
\]
Combining this with the bound $|D| \lesssim n^{1-e^{-2t}}$ we get
\[
\begin{aligned}
\max_{i\in B\cap D} X^t_i &\le e^{-t}\max_{i\in D} X_i + \sqrt{1-e^{-2t}}\,\max_{i\in B\cap D} X'_i \\
&\lesssim e^{-2t}\alpha\sqrt{2\log n} + (1-e^{-2t})\sqrt{2(1-r^2)\log n}.
\end{aligned}
\]
Thus, if $\sqrt{1-r^2} < \alpha$, or in other words $r > \sqrt{1-\alpha^2}$, we cannot have
that $\max_{i\in B\cap D} X^t_i \approx \alpha\sqrt{2\log n}$. If this does not happen, then $I^t \notin B \cap D$.
But we have already stated that $I^t \in D$ with high probability. Therefore,
whenever $r > \sqrt{1-\alpha^2}$, we have that $I^t \in D\setminus B$ with high probability, which
implies that $R(I^0, I^t) < r$ with high probability. Roughly, this justifies the
$\sqrt{1-\alpha^2}$ term in the statement of the theorem. The second term arises from
our attempts at making the above sketch rigorous, which is a somewhat
technically involved task.

Let us now begin the formal proof. The first step, as mentioned before,
is to show that $X_{I^t} \approx e^{-t}m$. This is made precise in the following lemma.

Lemma 5.2. Take any $t \ge 0$. Then for any $x \ge 0$, we have
\[
\mathbb{P}(|X_{I^t} - e^{-t}m| \ge x) \le 4e^{-x^2/4\sigma^2}.
\]

Proof. For notational convenience, let $a = e^{-t}$, $b = \sqrt{1-e^{-2t}}$, $Z := X^t$, and
$W := bX - aX'$. Then $Z$ and $W$ are independent, and
\[
X = aZ + bW. \tag{18}
\]
Since $a + b \le \sqrt{2}$, by Proposition 1.3 and the independence of $Z$ and $W$, we
have
\[
\begin{aligned}
\mathbb{P}(|X_{I^t} - am| \ge x) &\le \mathbb{P}\bigl(a|Z_{I^t} - m| \ge ax/\sqrt{2}\bigr) + \mathbb{P}\bigl(b|W_{I^t}| \ge bx/\sqrt{2}\bigr) \\
&\le 2e^{-x^2/4\sigma^2} + 2e^{-x^2/4\sigma^2}.
\end{aligned}
\]
This completes the proof. □
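Lemma 5.2 says that the original field, evaluated at the maximizer of the perturbed field, concentrates around $e^{-t}m$. A quick simulation makes this concrete. The sketch below is an illustration under assumptions that are not in the text: the field is $n$ i.i.d. standard Gaussian coordinates, $m$ is replaced by an empirical average of the maximum, and the function name is hypothetical.

```python
import math
import random

def value_at_perturbed_argmax(n, t, trials, seed=0):
    """Return (average of max_i X_i, average of X at the argmax of X^t)
    for an i.i.d. N(0,1) field, with X^t = a X + b X',
    a = e^{-t}, b = sqrt(1 - e^{-2t})."""
    rng = random.Random(seed)
    a = math.exp(-t)
    b = math.sqrt(1.0 - math.exp(-2.0 * t))
    sum_max = 0.0
    sum_value = 0.0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        x_t = [a * xi + b * rng.gauss(0.0, 1.0) for xi in x]
        i_t = max(range(n), key=x_t.__getitem__)  # maximizer of the perturbed field
        sum_max += max(x)
        sum_value += x[i_t]
    return sum_max / trials, sum_value / trials
```

With `m_hat, value = value_at_perturbed_argmax(300, 0.5, 400, seed=2)`, the second number should be close to `math.exp(-0.5) * m_hat`, in line with the decomposition $X = aZ + bW$ used in the proof.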

Proof of Theorem 5.1. Without loss of generality, we assume that $\sigma^2 = 1$.
As in Lemma 5.2, let $a = e^{-t}$, $b = \sqrt{1-e^{-2t}}$, and $Z = X^t$. Let $n = |S|$.
Note that since $|R(i,j)| \le 1$ for all $i,j$ and $C$ can be chosen as large as we
like, it suffices to prove the theorem assuming that $n$ is larger than some
fixed threshold and that
\[
\frac{1}{b}\Bigl(\frac{\log\log n}{\alpha^2\log n}\Bigr)^{1/4} \le \frac{1}{100}. \tag{19}
\]
For the same reason, there is no loss of generality in assuming that $m \ge 2$.
With that assumption, define



. ^ . A/log(mV2) , , 

20 5:=-^^^ 0,1, 

m 

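and let
\[
D := \{i : |X_i - am| \le \delta m\}.
\]
By Lemma 5.2,
\[
\mathbb{P}(I^t \notin D) \le 4e^{-\delta^2m^2/4}. \tag{21}
\]
Now, if $a > \delta$, then by the Mills ratio upper bound from Section 2,
\[
\mathbb{E}|D| \le \sum_i \mathbb{P}(X_i \ge (a-\delta)m) \le n^{1-(a-\delta)^2\alpha^2}.
\]
Therefore, if we define
\[
c := \begin{cases} 1 - (a-\delta)^2\alpha^2 + \delta^2 & \text{if } a > \delta, \\ 1 & \text{if } a \le \delta, \end{cases}
\]
then in all cases we have
\[
\mathbb{P}(|D| > n^c) \le n^{-\delta^2}. \tag{22}
\]
Define
\[
\gamma := \frac{\alpha(b^2 - 2\delta)}{b\sqrt{c}}.
\]
It is easy to verify using (19) that $b^2 \ge 2\delta$ and hence $\gamma \ge 0$. Again,
\[
\frac{\alpha(b^2 - 2\delta)}{b\sqrt{c}} \le \frac{b^2 - 2\delta}{b\sqrt{1-a^2}} = \frac{b^2 - 2\delta}{b^2} \le 1.
\]
Thus, $\gamma \in [0,1]$. Let $r := \sqrt{1-\gamma^2}$, and define the random set
\[
B := \{i : R(I^0, i) \ge r\}.
\]
Note that
\[
\mathbb{P}(R(I^0, I^t) \ge r) = \mathbb{P}(I^t \in B) \le \mathbb{P}(I^t \in B \cap D) + \mathbb{P}(I^t \notin D). \tag{23}
\]
Let $\mathbb{E}^0$ and $\mathrm{Var}^0$ denote the conditional expectation and variance given $X$.
Since $R(i,i) \le 1$ and $R(I^0, i) \ge r$ for all $i \in B$ and $X'$ is independent of $X$,
we have
\[
\mathrm{Var}^0(X'_i - rX'_{I^0}) \le 1 + r^2 - 2r^2 = \gamma^2.
\]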
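Again, $\mathbb{E}^0(X'_{I^0}) = 0$. Thus, if $B \cap D \ne \emptyset$, then by Lemma 2.1 we have
\[
\begin{aligned}
\mathbb{E}^0\bigl(\max_{i\in B\cap D} X'_i\bigr) &= \mathbb{E}^0\bigl(\max_{i\in B\cap D}(X'_i - rX'_{I^0})\bigr) \\
&\le \gamma\sqrt{2\log|B\cap D|} \le \gamma\sqrt{2\log|D|}.
\end{aligned}
\]
Combined with Proposition 1.3 this implies that if $B \cap D \ne \emptyset$, then for all
$x \ge 0$,
\[
\mathbb{P}^0\bigl(\max_{i\in B\cap D} X'_i \ge \gamma\sqrt{2\log|D|} + x\bigr) \le e^{-x^2/2}. \tag{24}
\]
Clearly, the inequality holds even if we relax the condition $B \cap D \ne \emptyset$ to
just $D \ne \emptyset$, interpreting the maximum of an empty set as $-\infty$. Since
$X_i \le (a+\delta)m$ for $i \in D$ and $Z = aX + bX'$, we have
\[
\begin{aligned}
\max_{i\in B\cap D} Z_i &\le a\max_{i\in B\cap D} X_i + b\max_{i\in B\cap D} X'_i \\
&\le a(a+\delta)m + b\max_{i\in B\cap D} X'_i.
\end{aligned}
\]
Thus, from (24), we see that whenever $D \ne \emptyset$,
\[
\mathbb{P}^0\bigl(\max_{i\in B\cap D} Z_i \ge a(a+\delta)m + b\gamma\sqrt{2\log|D|} + bx\bigr) \le e^{-x^2/2}. \tag{25}
\]
Putting
\[
\begin{aligned}
m' &:= a(a+\delta)m + b\gamma\sqrt{2c\log n} \\
&= \Bigl(a(a+\delta) + \frac{b\gamma\sqrt{c}}{\alpha}\Bigr)m \\
&= (a^2 + a\delta + b^2 - 2\delta)m = (1 + a\delta - 2\delta)m,
\end{aligned}
\]
and using (22) and (25) we get
\[
\begin{aligned}
\mathbb{P}\bigl(\max_{i\in B\cap D} Z_i \ge m' + bx\bigr) &\le \mathbb{E}\bigl(\mathbb{P}^0(\max_{i\in B\cap D} Z_i \ge m' + bx);\ 1 \le |D| \le n^c\bigr) \\
&\qquad + \mathbb{E}\bigl(\mathbb{P}^0(\max_{i\in B\cap D} Z_i \ge m' + bx);\ |D| > n^c\bigr) \\
&\le e^{-x^2/2} + n^{-\delta^2}.
\end{aligned} \tag{26}
\]
Now
\[
m - m' = (2-a)\delta m \ge \delta m. \tag{27}
\]
In particular, $m' < m$. Let $x := (m - m')/2b$. Then by (26) and Proposition
1.3 we have
\[
\begin{aligned}
\mathbb{P}(I^t \in B \cap D) &\le \mathbb{P}\bigl(\max_{i\in B\cap D} Z_i \ge m' + bx\bigr) + \mathbb{P}\bigl(\max_{i\in S} Z_i \le m - bx\bigr) \\
&\le e^{-x^2/2} + n^{-\delta^2} + e^{-b^2x^2/2} \\
&\le 2e^{-(\delta m)^2/8} + n^{-\delta^2}.
\end{aligned} \tag{28}
\]
Thus, by (21), (23), and (28), we have
\[
\mathbb{P}(R(I^0, I^t) \ge r) \le 6e^{-\delta^2m^2/8} + n^{-\delta^2} \le 7e^{-\delta^2m^2/8}. \tag{29}
\]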
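Now, if $a > \delta$, then
\[
\begin{aligned}
r^2 &= \frac{b^2c - \alpha^2(b^4 - 4b^2\delta + 4\delta^2)}{b^2c} \\
&= \frac{b^2(1 - a^2\alpha^2 + 2a\delta\alpha^2 - \alpha^2\delta^2 + \delta^2) - \alpha^2(b^4 - 4b^2\delta + 4\delta^2)}{b^2c} \\
&\le \frac{(1-\alpha^2)b^2 + 2b^2a\delta\alpha^2 + b^2\delta^2 + 4\alpha^2b^2\delta}{b^4} \\
&\le \frac{1 - \alpha^2 + C\delta}{b^2},
\end{aligned}
\]
where $C$ denotes a universal constant (whose value may change from line to
line). Again, if $a \le \delta$,
\[
\begin{aligned}
r^2 &= \frac{b^2 - \alpha^2(b^4 - 4b^2\delta + 4\delta^2)}{b^2} \\
&\le 1 - \alpha^2b^2 + 4\alpha^2\delta \\
&= 1 - \alpha^2 + a^2\alpha^2 + 4\alpha^2\delta \le 1 - \alpha^2 + 5\delta.
\end{aligned}
\]
Therefore in all cases we have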

E(i2(/°, /*)) < r + P(i?(/°, /*) > r) 



62 

From the definition (20) of $\delta$ we have



^1/2 

Since $\sqrt{x+y} \le \sqrt{x} + \sqrt{y}$ and $a \le 1$, we get

Now, $m^{-1/2} = \alpha^{-1/2}(2\log n)^{-1/4}$. Since $R(i,j) \le 1$ for all $i,j$, we can put a
large enough constant in front of the first term to remove the $\alpha^{1/2}$ from
the denominator of the second term. Finally, the bound on $v$ comes from
Lemma 3.1 and our bound on $\mathbb{E}(R(I^0, I^t))$. This completes the proof. □

6. Example: Chaotic nature of the first eigenvector 

A random Hermitian matrix $A = (a_{ij})_{1\le i,j\le n}$ is said to belong to the
Gaussian Unitary Ensemble (GUE) if (i) $(a_{ij})_{1\le i\le j\le n}$ are independent random
variables, (ii) the diagonal entries are standard real Gaussian random
variables, and (iii) $(a_{ij})_{1\le i<j\le n}$ are standard complex Gaussian random variables
(i.e. real and imaginary parts are i.i.d. $N(0,1/2)$).

Eigenvalues of GUE matrices are among the most widely studied objects
in random matrix theory. For a general introduction to the classical random
matrix ensembles and results, we refer to the book by Mehta [51]. The study
of the largest eigenvalue was revolutionized through the work of Tracy and
Widom [78, 79, 80]. One of the striking implications of their work is that the
largest eigenvalue has variance of order $n^{-1/3}$, beating the $O(1)$ bound given
by standard isoperimetric and martingale methods. But the Tracy-Widom
result is in the sense of weak convergence, and does not provide an actual
bound on the variance that we need. A variance bound of order $n^{-1/3}$ was
proved by Ledoux [48] and independently by Aubrun [5].

The eigenvectors of GUE matrices, taken as rows (or columns) of a matrix,
give rise to a Haar-distributed unitary matrix. In that sense, they
are quite well-understood. However, the behavior of the eigenvectors under
perturbations of the matrix has not been studied. Such questions arise,
for instance, in the study of chaos in the spherical Sherrington-Kirkpatrick
model of spin glasses. For the definition of this model and further references,
let us refer to the recent paper of Talagrand and Panchenko [59], where it
was proved that the model is chaotic with respect to an external field. The
goal of this section is to show that the first eigenvector is unstable under
small perturbations of the matrix, and give a quantitative version of this
statement. In the spherical SK model with complex spins, this establishes
chaos with respect to the disorder at zero temperature.

Let us now formulate the question in terms of Gaussian fields. For each 
vector u in the unit sphere 5"^""^ of C", define the quadratic form 

n 

Xu := u*ylu = aijUiUj, 

where, as usual, Uj is the complex conjugate of Uj and u* is the adjoint (i.e. 
conjugate transposed) of u. Since A is Hermitian, Xu is a real Gaussian 
random variable. 

Now, if $v = zu$ for some $z$ in the unit sphere $U(1)$ of $\mathbb{C}$, then $X_v = X_u$.
Therefore, to retain identifiability, we define $X_u$ for each $u$ not in $S^{2n-1}$, but
in the complex projective space $\mathbb{CP}^{n-1} = S^{2n-1}/U(1)$. However, we will
continue to write elements of $\mathbb{CP}^{n-1}$ as if they were elements of $\mathbb{C}^n$, with
the quotienting being implicit. With that convention, let

\[ \lambda_1 := \max_{u \in \mathbb{CP}^{n-1}} X_u, \qquad u_1 := \operatorname{argmax}_{u \in \mathbb{CP}^{n-1}} X_u. \]

Then $\lambda_1$ is the largest eigenvalue of the GUE matrix $A$ and $u_1$ is the
corresponding unit eigenvector. Our objective in this section is to show that $u_1$
is chaotic under small perturbations of $A$.

Here we must remark that $u_1$ is almost surely well-defined in $\mathbb{CP}^{n-1}$.
This follows from the fact that the eigenvalues all have multiplicity 1 almost 
surely, which can be deduced, for instance, from the well-known joint density 
of the eigenvalues of GUE (see e.g. Mehta [51], Chapter 3). 

Now let $A'$ be an independent copy of $A$, and as usual define the perturbed
matrix $A^t := e^{-t}A + \sqrt{1 - e^{-2t}}\,A'$. Let $u_1^t$ be the first eigenvector of $A^t$. We
want to show that $|\langle u_1, u_1^t\rangle| := \bigl|\sum_{i=1}^{n} u_{1,i}\,\overline{u_{1,i}^t}\bigr|$ tends to decay rapidly with $t$.
Note that there is no ambiguity because for any $u, v \in \mathbb{CP}^{n-1}$, $|\langle u, v\rangle|$ is
well-defined (although $\langle u, v\rangle$ is not).

Theorem 6.1. There is a universal constant $C$ such that for any $t > 0$,

\[ \mathbb{E}\,|\langle u_1, u_1^t\rangle|^2 \le \frac{C}{(1 - e^{-t})\,n^{1/3}}, \]

where $u_1^t$ is the first eigenvector of the perturbed matrix $A^t$ defined above.

Remarks. This shows that whenever $t \gg n^{-1/3}$, the vectors $u_1$ and $u_1^t$ are
almost orthogonal with high probability. As mentioned before, this result
also proves that the ground state of the (complex) spherical Sherrington-
Kirkpatrick model is chaotic under small perturbations of the disorder.

Proof. An easy computation gives that for any $u, v \in \mathbb{CP}^{n-1}$,

\[ \mathrm{Cov}(X_u, X_v) = |\langle u, v\rangle|^2, \]

where one should note that the right hand side is well-defined on $\mathbb{CP}^{n-1}$. It is
known from random matrix theory [48, 5] that $\mathrm{Var}(\lambda_1) \le Cn^{-1/3}$. Therefore the
above formula for the covariance and Theorem 3.2 seem to imply that the
proof is done. However, we have to be a little careful because Theorem 3.2
works only for Gaussian fields on finite index sets. But this can be easily
taken care of by taking the Gaussian field $(X_u)$ restricted to finer and finer
nets of points in $\mathbb{CP}^{n-1}$ and using the uniqueness of the maximizer and
continuity to pass to the limit. □
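
The 'easy computation' behind the covariance formula can be spelled out as follows. This is only a sketch, under the assumption (not restated in this section) that the GUE matrix is normalized so that $\mathbb{E}(a_{ij}a_{kl}) = \delta_{il}\delta_{jk}$, that is, real standard Gaussian diagonal entries and off-diagonal entries with $\mathbb{E}|a_{ij}|^2 = 1$ and $\mathbb{E}(a_{ij}^2) = 0$. Since $X_u$ and $X_v$ are centered and real:

```latex
\begin{align*}
\mathrm{Cov}(X_u, X_v)
  &= \sum_{i,j,k,l} \mathbb{E}(a_{ij}a_{kl})\,\bar{u}_i u_j\,\bar{v}_k v_l
   = \sum_{i,j} \bar{u}_i u_j\,\bar{v}_j v_i \\
  &= \Bigl(\sum_{i} \bar{u}_i v_i\Bigr)\Bigl(\sum_{j} u_j \bar{v}_j\Bigr)
   = |\langle u, v\rangle|^2 .
\end{align*}
```

Under a different normalization of the GUE the right hand side would only change by a constant factor.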



7. Example: Multiple global maxima in the NK fitness landscape

Kauffman and Levin [13j introduced a class of models for the evolution of 
hereditary systems, which has since become one of the most popular models 
in evolutionary biology and some other areas. They named it the NK model 
because there are two parameters, N and K. The model envisions a genome 
as consisting of $N$ genes, each of which exists as one of two possible alleles.
The fitness score of an allele at a given site is determined by the alleles of 
the K neighboring sites. Other than that, the fitnesses are as simple as 
possible, namely i.i.d., and the fitnesses of different sites are averaged to get 
the overall fitness of a genome. 

Let us define things formally. The space of all genomes is $\{0,1\}^N$. Let

\[ Y(i;\eta), \qquad 0 \le i \le N-1,\ \eta \in \{0,1\}^{K+1}, \]

be a collection of i.i.d. random variables, assumed to be standard Gaussian
for our purposes. Given $\sigma \in \{0,1\}^N$, define the 'fitness' of $\sigma$ as

\[ F(\sigma) := \sum_{i=0}^{N-1} Y\bigl(i; (\sigma_i, \sigma_{i+1}, \ldots, \sigma_{i+K})\bigr), \]

where the subscripts of $\sigma$ are wrapped around, i.e. $\sigma_{N+i} = \sigma_i$. The function
$F : \{0,1\}^N \to \mathbb{R}$ is called the 'fitness landscape', and the main objects of
interest are the local and global maxima of this landscape.

The NK model has been extensively studied, but very little of that work is
rigorous. The first rigorous paper on the model, to the best of our knowledge,
was due to Evans and Steinsaltz [24]. These authors used elegant arguments
involving max-plus algebras to carry out computations about the global and
local maxima of the NK model when $K$ is fixed and $N \to \infty$. One can also
find in [24] a beautifully written introduction to the history and motivation
behind the model. Further rigorous results were derived using different tools
by Durrett and Limic [23] who proved, among other things, a central limit
theorem for the maximum fitness when $K$ is fixed and $N \to \infty$. Some results
about the local maxima in the case where $N$ and $K$ both tend to infinity
were obtained by Limic and Pemantle [49].

Our goal is to show that when $N$ and $K$ both tend to infinity, the fitness
landscape has many global near-maxima, which are 'far away' from each
other. In other words, there are many nearly globally fittest genomes that
are drastically different from each other. To formalize this statement, we
first need to define what is meant by 'far away' in the space of genomes. The
NK model naturally defines the following measure of proximity between
two genomes $\sigma$ and $\sigma'$:

\[ \rho_{N,K}(\sigma,\sigma') := \bigl|\{i : (\sigma_i, \ldots, \sigma_{i+K}) = (\sigma'_i, \ldots, \sigma'_{i+K})\}\bigr|. \tag{30} \]

This is a natural definition because

\[ \rho_{N,K}(\sigma,\sigma') = \mathrm{Cov}(F(\sigma), F(\sigma')). \]

Note also that if $K$ is large, then $\sigma$ and $\sigma'$ may be far apart even if the
Hamming distance between $\sigma$ and $\sigma'$ is relatively small.
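
To make the definitions concrete, here is a minimal brute-force sketch of the fitness landscape $F$ and the proximity measure $\rho_{N,K}$ for small $N$. The function names (`make_nk_landscape`, `proximity`) and the lazy sampling of the scores $Y(i;\eta)$ are illustrative choices, not part of the paper.

```python
import itertools
import random


def make_nk_landscape(N, K, seed=0):
    """Return the fitness function F of one realization of the NK model."""
    rng = random.Random(seed)
    scores = {}  # maps (i, eta) -> Y(i; eta); each score is drawn once, on first use

    def fitness(sigma):
        total = 0.0
        for i in range(N):
            # window (sigma_i, ..., sigma_{i+K}), indices wrapped around mod N
            eta = tuple(sigma[(i + j) % N] for j in range(K + 1))
            if (i, eta) not in scores:
                scores[(i, eta)] = rng.gauss(0.0, 1.0)
            total += scores[(i, eta)]
        return total

    return fitness


def proximity(sigma, tau, K):
    """rho_{N,K}: number of sites i whose (K+1)-windows agree in sigma and tau."""
    N = len(sigma)
    return sum(
        all(sigma[(i + j) % N] == tau[(i + j) % N] for j in range(K + 1))
        for i in range(N)
    )


if __name__ == "__main__":
    N, K = 10, 3
    F = make_nk_landscape(N, K, seed=1)
    genomes = list(itertools.product((0, 1), repeat=N))
    best = max(genomes, key=F)
    runner_up = max((s for s in genomes if s != best), key=F)
    print("fittest genome:", best, "fitness:", round(F(best), 3))
    print("proximity to runner-up:", proximity(best, runner_up, K), "out of", N)
```

A single allele change already lowers the proximity by $K+1$, which illustrates the remark that genomes can be far apart in the sense of $\rho_{N,K}$ while being close in Hamming distance.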

To understand the nature of the global maximum, we first need to have an
idea about its size. It was shown by Evans and Steinsaltz [24] that the size
of the global maximum grows linearly in $N$ when $K$ is fixed (in an asymptotic
sense). Following their argument, one can further deduce the surprising fact
that when divided by $N$ the expected size of the global maximum can be
bounded above and below by universal constants that do not depend on $K$.

Lemma 7.1. Irrespective of the value of $K$, we have

\[ \frac{N}{\sqrt{\pi}} \le \mathbb{E}\bigl(\max_{\sigma} F(\sigma)\bigr) \le N\sqrt{2\log 2}. \]

Remark. In fact, the bounds are sharp. The lower bound is achieved when 
K = 0, and the upper bound is achieved (asymptotically) when K = N — 1. 
Moreover, as we will see in the proof, the expected value is an increasing 
function of K. 

Proof. The upper bound is straightforward from Lemma 2.1. For the lower
bound, first observe that if $G(\sigma)$ is another measure of fitness, corresponding
to the NK model with $K = 0$, then for every $\sigma, \sigma'$, we have

\[ \mathrm{Cov}(G(\sigma), G(\sigma')) \ge \mathrm{Cov}(F(\sigma), F(\sigma')), \]

since it is quite clear that $\rho_{N,K}(\sigma,\sigma')$ is a decreasing function of $K$. Moreover,
$\mathrm{Var}(F(\sigma)) = \mathrm{Var}(G(\sigma)) = N$ for every $\sigma$. Therefore by Slepian's
lemma,

\[ \mathbb{E}\bigl(\max_{\sigma} F(\sigma)\bigr) \ge \mathbb{E}\bigl(\max_{\sigma} G(\sigma)\bigr). \]

Now, if $(Z(i,\eta))_{0 \le i \le N-1,\ \eta \in \{0,1\}}$ are the i.i.d. Gaussian fitness scores used to
define $G$, then

\[ \max_{\sigma} G(\sigma) = \sum_{i=0}^{N-1} \max\{Z(i,0), Z(i,1)\}. \]

It is easy to compute that the expected value of the maximum of two
independent standard Gaussian random variables is $\pi^{-1/2}$. This completes the
proof. □
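
For completeness, the value $\pi^{-1/2}$ can be checked in one line. The notation $Z_1, Z_2$ for the two independent standard Gaussians is introduced here only for this check; it uses the identity $\max\{a,b\} = \tfrac12(a+b) + \tfrac12|a-b|$ and the fact that $Z_1 - Z_2$ is Gaussian with variance $2$:

```latex
\[
\mathbb{E}\max\{Z_1, Z_2\}
  = \tfrac12\,\mathbb{E}(Z_1 + Z_2) + \tfrac12\,\mathbb{E}|Z_1 - Z_2|
  = 0 + \tfrac12 \cdot \sqrt{2}\cdot\sqrt{\tfrac{2}{\pi}}
  = \frac{1}{\sqrt{\pi}} .
\]
```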

The following theorem is the main result of this section. It shows that
when $N$ and $K$ both tend to infinity, the fitness landscape exhibits the
multiple peaks property. We first establish superconcentration using
hypercontractivity, and then use Theorem 3.7 to deduce the existence of multiple
peaks.

Theorem 7.2. Suppose $N \ge K \ge 1$. Then $\mathrm{Var}(\max_{\sigma} F(\sigma)) \le CN/K$,
where $C$ is a universal constant. As a consequence, there is another universal
constant $C$ such that for any $a > 1$, with probability at least

1 - cv^/^Tb^ 

there exists $A \subseteq \{0,1\}^N$ such that

(a) 1^1 > 

(b) pTv^i^ (cr, cr') < a~^^^N for all cr,cr' £ A, cr ^ cr' , where Pn,k is the 
measure (130p of proximity between genomes, and 

(c) for all $\sigma \in A$,

\[ F(\sigma) \ge \max_{\sigma' \in \{0,1\}^N} F(\sigma') - \frac{aN}{K}. \]

Remarks. By Lemma 7.1 and the concentration of $\max F(\sigma)$, we know
that $\max F(\sigma)$ is of order $N$. This shows that whenever $K$ is large, there
are many near-maximal configurations that are 'far apart' in the sense of
the proximity measure $\rho_{N,K}$, since $\rho_{N,K}$ ranges between $0$ and $N$. The
quantification of 'many' and 'far apart' depends on our choice of $a$. The
theorem shows that any large $a$ works, as long as $a \ll K$. Of course, the
theorem in its present form does not have any relevance for realistic values
of $N$ and $K$, but we believe that there is room for improvement to an extent
that can be of practical significance. Finally, note that the theorem proves
strong multiple peaks when $K$ grows faster than $\sqrt{N}$.
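Proof. Let $M = \max_{\sigma} F(\sigma)$. Let $\hat{\sigma}$ be the maximizing configuration. With
obvious notation, we have

\[ \frac{\partial M}{\partial Y(i;\eta)} = \mathbb{I}_{\{(\hat{\sigma}_i, \ldots, \hat{\sigma}_{i+K}) = (\eta_1, \ldots, \eta_{K+1})\}}. \]

Therefore by Lemma 3.3, Lemma 4.1 and symmetry, we have

\begin{align*}
\mathrm{Var}(M)
  &= \int_0^\infty e^{-t} \sum_{i,\eta} \mathbb{E}\bigl(\mathbb{I}_{\{(\hat{\sigma}_i, \ldots, \hat{\sigma}_{i+K}) = (\eta_1, \ldots, \eta_{K+1})\}}\,\mathbb{I}_{\{(\hat{\sigma}^t_i, \ldots, \hat{\sigma}^t_{i+K}) = (\eta_1, \ldots, \eta_{K+1})\}}\bigr)\,dt \\
  &\le \int_0^\infty e^{-t} \sum_{i,\eta} \bigl[\mathbb{P}\bigl((\hat{\sigma}_i, \ldots, \hat{\sigma}_{i+K}) = (\eta_1, \ldots, \eta_{K+1})\bigr)\bigr]^{1+\tanh(t/2)}\,dt \\
  &= N \int_0^\infty e^{-t}\, 2^{K+1}\bigl(2^{-(K+1)}\bigr)^{1+\tanh(t/2)}\,dt \\
  &\le \frac{CN}{K},
\end{align*}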

where $C$ is a universal constant. Now let $m = \mathbb{E}(M)$ and $v = \mathrm{Var}(M)$. From
Lemma 7.1 we know that $N/\sqrt{\pi} \le m \le N\sqrt{2\log 2}$, and from the above
computation we know that $v \le CN/K$. Let $\sigma^2 = \max_{\sigma} \mathrm{Var}(F(\sigma)) = N$.
Take any a > 1, and let / = [a^/^^] + l, e = a-^/^iV, and 6 = aN/K. Then 



5e 



where $C$ is a universal constant. The second term is dominated by the first.
The result now follows from Theorem 3.7. □



8. Example: Chaos in directed polymers 

We introduced directed polymers in Section 1. Let us now refresh the
reader's memory by defining the model once again.

The (1+1)-dimensional directed polymer model in Gaussian environment
is defined as follows. Let $\mathbb{Z}_+^2$ denote the set of all pairs of non-negative
integers, with the lattice graph structure. Let $g = (g_v)_{v \in \mathbb{Z}_+^2}$ be a collection
of i.i.d. standard Gaussian random variables. A directed polymer of length
$n$ is a sequence of $n$ adjacent points, beginning at the origin, such that each
successive point is either to the right or above the previous point. Thus,
there are a total of $2^{n-1}$ directed polymers of length $n$. Let $\mathcal{P}_n$ denote this
set. An element of $\mathcal{P}_n$ will generally be denoted by $p = (v_0, \ldots, v_{n-1})$, where
$v_0 = 0$. The energy of a polymer $p = (v_0, \ldots, v_{n-1})$ in the environment $g$ is
defined as

\[ E(p) := -\sum_{i=0}^{n-1} g_{v_i}. \]
For each $p \in \mathcal{P}_n$, let $X_p := -E(p)$ denote the 'weight' of the path, and
let $X = (X_p)_{p \in \mathcal{P}_n}$. Our object of interest is the polymer with the minimum
energy (i.e. maximum weight), usually called the ground state of the system.
Suppressing the subscript $n$, we simply denote the minimum energy path by
$\hat{p}$ and its energy by $-M$, that is,

\[ M := \max_{p \in \mathcal{P}_n} X_p = X_{\hat{p}}. \]

Note that for any $p, p' \in \mathcal{P}_n$,

\[ \mathrm{Cov}(X_p, X_{p'}) = |p \cap p'|, \]

where $|p \cap p'|$ denotes the number of vertices common to $p$ and $p'$.

If $g'$ is an independent copy of $g$, and we define the perturbed environment
$g^t := e^{-t}g + \sqrt{1 - e^{-2t}}\,g'$, then the weights of the paths defined in the
environment $g^t$ correspond to our usual definition of $X^t$. Let $\hat{p}^t$ denote
the maximum weight path in the environment $g^t$ (henceforth, simply the
'maximal path'). The main result of this section is the following.

Theorem 8.1. For some universal constant $C$, we have

\[ \mathrm{Var}(M) \le \frac{Cn}{\log n}. \]

Consequently, for any $t > 0$,

\[ \mathbb{E}\,|\hat{p} \cap \hat{p}^t| \le \frac{Cn}{(1 - e^{-t})\log n}. \]

It is easy to see how Theorem 1.1 follows from this result. Another remark
is that the superconcentration of the ground state energy has implications
for the fluctuations of the polymer shape and especially the end point
of the minimum energy polymer; see Wehr and Aizenman [82] for further
insights.
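
The quantities in Theorem 8.1 are easy to explore numerically. The following is a minimal sketch, not the method of the paper: it finds the maximal path by dynamic programming over the levels $|v| = 0, 1, \ldots, n-1$, builds the perturbed environment $e^{-t}g + \sqrt{1 - e^{-2t}}\,g'$, and counts the vertices shared by the two maximal paths. All function names are illustrative.

```python
import math
import random


def gaussian_environment(n, rng):
    """i.i.d. N(0,1) weights g_v for every v = (x, y) with x, y >= 0 and x + y <= n - 1."""
    return {(x, s - x): rng.gauss(0.0, 1.0) for s in range(n) for x in range(s + 1)}


def perturb(g, g_prime, t):
    """Vertex-wise perturbation e^{-t} g + sqrt(1 - e^{-2t}) g'."""
    a, b = math.exp(-t), math.sqrt(1.0 - math.exp(-2.0 * t))
    return {v: a * g[v] + b * g_prime[v] for v in g}


def maximal_path(g, n):
    """Return (M, path): the maximum weight over directed paths with n vertices from (0, 0)."""
    best = {(0, 0): g[(0, 0)]}
    parent = {(0, 0): None}
    for s in range(1, n):
        for x in range(s + 1):
            v = (x, s - x)
            # possible predecessors: the vertex to the left and the vertex below
            preds = [u for u in ((x - 1, s - x), (x, s - x - 1)) if u in best]
            u = max(preds, key=best.__getitem__)
            best[v] = best[u] + g[v]
            parent[v] = u
    end = max(((x, n - 1 - x) for x in range(n)), key=best.__getitem__)
    path, v = [], end
    while v is not None:
        path.append(v)
        v = parent[v]
    return best[end], path[::-1]


if __name__ == "__main__":
    rng = random.Random(0)
    n, t = 200, 0.05
    g = gaussian_environment(n, rng)
    g_prime = gaussian_environment(n, rng)
    M, path = maximal_path(g, n)
    M_t, path_t = maximal_path(perturb(g, g_prime, t), n)
    overlap = len(set(path) & set(path_t))
    print(f"n={n}, t={t}: M={M:.2f}, M^t={M_t:.2f}, overlap={overlap} of {n} vertices")
```

Averaging the overlap over many independent environments gives a Monte Carlo estimate of the left hand side of the second bound in Theorem 8.1; a single run, as above, is only indicative.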

One may object that the perturbation described above is not a small per- 
turbation at all, because we are perturbing all coordinates, and moreover 
independently. Indeed, the regular notion of perturbation in noise-sensitivity 
theory involves choosing a small fraction of coordinates at random and re- 
placing the weights with independent copies. However, the two notions lead 
to the same results in practice, because in one case we are giving large per- 
turbations to a small collection of coordinates, while in the other case (i.e. 
our case) we are giving small perturbations to all coordinates. One can ver- 
ify that with correct calibration, the two perturbations have similar effects 
on almost every conceivable summary statistic. 

Secondly, our definition of perturbation of a Gaussian field is the one 
favored by the physicists, who see it as an Ornstein-Uhlenbeck flow over a 
small period of time. 
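
The sense in which this perturbation is 'small' for small $t$ can be made explicit. The following short computation, a sketch that uses only the independence of $g$ and $g'$, shows that each coordinate keeps its standard Gaussian law while its correlation with the original value is $e^{-t}$. Writing $g^t_v := e^{-t}g_v + \sqrt{1 - e^{-2t}}\,g'_v$ for the perturbed weight at a vertex $v$:

```latex
\begin{align*}
\mathrm{Var}(g^t_v) &= e^{-2t}\,\mathrm{Var}(g_v) + (1 - e^{-2t})\,\mathrm{Var}(g'_v) = 1, \\
\mathrm{Cov}(g_v, g^t_v) &= e^{-t}\,\mathrm{Var}(g_v) = e^{-t} = 1 - t + O(t^2).
\end{align*}
```

In particular, summing over the vertices of two paths gives $\mathrm{Cov}(X_p, X^t_{p'}) = e^{-t}\,|p \cap p'|$.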

The chaos property for directed polymers was first argued heuristically
by Huse, Henley, and Fisher [38]. It was subsequently studied numerically
and theoretically by Zhang [83] and Mezard [52]. A detailed theoretical
study was done by Fisher and Huse [25] and later, another one by Hwa and
Fisher [39]. For a more recent work, see da Silveira and Bouchaud [66].
However, none of these papers give rigorous proofs, or even proofs that can
be made rigorous with existing technology. To the best of our knowledge,
no such proof exists.

The scheme of the proof of Theorem 8.1 is straightforward given the tools
we have. We apply Talagrand's technique of bounding variances using
hypercontractivity, and use the resulting superconcentration bound to infer
chaos via Theorem 1.8 (and the actual bound via Theorem 3.2). However,
on our way to applying hypercontractivity, we run into the difficulty that the
probability of the optimal path passing through a given vertex is completely
unknown; even a reasonable upper bound is unknown. This problem is
overcome by a brilliant trick from a paper of Benjamini, Kalai, and Schramm [7],
where a similar variance-bounding problem for first passage percolation was
tackled. But serious technical issues remain even after this, because adapting
the BKS trick requires a lot more effort in this case than in the case
handled in [7], owing to the restriction that the paths have to be directed.
In fact, this is the sole reason why the proof in its present form does not
generalize to dimensions higher than 2, even though the BKS result for first
passage percolation holds in all dimensions.

We should mention here that there exists a large amount of rigorous math- 
ematics on directed polymers, even if there is none on the chaos problem. 
The rigorous study was probably initiated by Imbrie and Spencer [IQ], and 
subsequently carried forward by many authors (e.g. [I0],[67], [3], [69], [S], 
|14j , [15] , [21] , |20j ) . A very notable contribution was due to Johansson [¥T] , 
who found a miraculous way to do exact computations in the model with 
Geometric vertex weights instead of Gaussian, showing that $M$ has
fluctuations of order $n^{1/3}$, and moreover a Tracy-Widom limiting distribution
upon proper centering and scaling. The result was extended to binary edge
weights in Johansson [32], Section 5.1 (see also [30]). The recent work of
Cator and Groeneboom [16] implies a probabilistic proof of the $n^{1/3}$
fluctuations, but again for Geometric vertex weights.

We need some preparation before embarking on the proof of Theorem
8.1. First of all, let us remind the reader of our convention that $C$ denotes
any positive universal constant whose value may change from line to line.
This convention will be repeatedly invoked in the remainder of this section.
Next, for each $w \in \mathbb{Z}_+^2$, define the translated environment $g_w = (g_{w,v})_{v \in \mathbb{Z}_+^2}$
as $g_{w,v} := g_{w+v}$. Now fix $n$, and define $X_w$, $X_w^t$, $\hat{p}_w$, $\hat{p}_w^t$, and $M_w$ as the
analogs of $X$, $X^t$, $\hat{p}$, $\hat{p}^t$, and $M$ for the environment $g_w$. Next, let

\[ B := \{(i,j) : 1 \le i, j \le [n^{1/4}]\} \tag{31} \]

and let

\[ \tilde{M} := \frac{1}{|B|} \sum_{w \in B} M_w. \]

Our first lemma is the following.

Lemma 8.2. We have

\[ \mathrm{Var}(\tilde{M}) \le \frac{Cn}{\log n}. \]

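Proof. With obvious notation, we have

\[ \frac{\partial \tilde{M}}{\partial g_v} = \frac{1}{|B|} \sum_{w \in B} \frac{\partial M_w}{\partial g_v} = \frac{1}{|B|} \sum_{w \in B} \mathbb{I}_{\{v - w \in \hat{p}_w\}}. \]

Therefore by Lemma 3.3 and Lemma 4.1 we have

\begin{align*}
\mathrm{Var}(\tilde{M})
  &= \int_0^\infty e^{-t} \sum_{v \in \mathbb{Z}_+^2} \mathbb{E}\Bigl[\Bigl(\frac{1}{|B|} \sum_{w \in B} \mathbb{I}_{\{v - w \in \hat{p}_w\}}\Bigr)\Bigl(\frac{1}{|B|} \sum_{w \in B} \mathbb{I}_{\{v - w \in \hat{p}^t_w\}}\Bigr)\Bigr]\,dt \\
  &\le \int_0^\infty e^{-t} \sum_{v \in \mathbb{Z}_+^2} \Bigl[\mathbb{E}\Bigl(\frac{1}{|B|} \sum_{w \in B} \mathbb{I}_{\{v - w \in \hat{p}_w\}}\Bigr)\Bigr]^{1+\tanh(t/2)}\,dt.
\end{align*}

Now fix any $v \in \mathbb{Z}_+^2$. Then

\[ \mathbb{E}\Bigl(\frac{1}{|B|} \sum_{w \in B} \mathbb{I}_{\{v - w \in \hat{p}_w\}}\Bigr) = \frac{1}{|B|} \sum_{w \in B} \mathbb{P}(v - w \in \hat{p}) = \mathbb{E}\Bigl(\frac{1}{|B|} \sum_{w \in B} \mathbb{I}_{\{v - w \in \hat{p}\}}\Bigr). \]

Since $\hat{p}$ is a directed path and $B$ is an $[n^{1/4}] \times [n^{1/4}]$ square, we certainly
have

\[ \sum_{w \in B} \mathbb{I}_{\{v - w \in \hat{p}\}} \le 2[n^{1/4}] - 1. \]

Combining the last three observations, we have that

\[ \mathrm{Var}(\tilde{M}) \le \int_0^\infty C e^{-t}\, n^{-\frac{1}{4}\tanh(t/2)}\, \mathbb{E}\Bigl(\frac{1}{|B|} \sum_{w \in B} \sum_{v \in \mathbb{Z}_+^2} \mathbb{I}_{\{v - w \in \hat{p}\}}\Bigr)\,dt \le \frac{Cn}{\log n}. \]

This completes the proof of the lemma. □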

Let us now introduce some further notation. For any $v \in \mathbb{Z}^2$, let $|v|$
denote the sum of its coordinates. For any $u, v \in \mathbb{Z}^2$, we write $u \le v$ if
$v - u \in \mathbb{Z}_+^2$. For any such $u$, $v$, let

\[ M_{u \to v} := \max\Bigl\{ \sum_{x \in p \setminus \{v\}} g_x : p \text{ is a directed path from } u \text{ to } v \Bigr\}. \]

For any $u \in \mathbb{Z}_+^2$, $n \ge 1$, let

\[ M_u^n := \max\Bigl\{ \sum_{v \in p} g_v : p \text{ is a directed path of length } n \text{, starting at } u \Bigr\}. \]

Let $\hat{p}_{u \to v}$ and $\hat{p}_u^n$ denote the maximizing paths.

Clearly, the distribution of $M_u^n$ depends only on $n$, and the distribution
of $M_{u \to v}$ depends only on $v - u$. Let us now prove some inequalities for the
expected values of these quantities. In the following, $0$ can either denote the
real number $0$, or the point $(0,0)$ in $\mathbb{Z}^2$, depending on the context.

Lemma 8.3. For any $1 \le m \le n$, we have

\[ \mathbb{E}(M_0^n) \ge \mathbb{E}(M_0^m) + C(n - m). \]

Proof. Clearly, it suffices to prove the claim when $n = m + 1$. Let $(v_0, \ldots, v_{m-1})$ denote
the maximal path of length $m$. Let $u$ and $w$ be the vertices immediately to
the right and above $v_{m-1}$. Since the definition of the maximal path of length
$m$ does not involve vertices $v$ with $|v| \ge m$, it follows that $(g_u, g_w)$ is still a
pair of i.i.d. standard Gaussian random variables, even though $u$ and $w$ are
random. To complete the proof, note that $M_0^{m+1} \ge M_0^m + \max\{g_u, g_w\}$. □

Lemma 8.4. For any $v = (v^x, v^y) \in \mathbb{Z}_+^2$, we have

\[ \mathbb{E}(M_{0 \to v}) \le C|v| \sqrt{H\Bigl(\frac{v^x}{|v|}\Bigr)}, \]

where $H : [0,1] \to \mathbb{R}$ is the function $H(a) = -a\log a - (1-a)\log(1-a)$.

Proof. Note that the total number of directed paths from $0$ to $v$ is

\[ \binom{v^x + v^y}{v^x}. \]

For any $a \in [0,1]$, we have

\[ 1 \ge \binom{v^x + v^y}{v^x} a^{v^x} (1-a)^{v^y}. \]

Taking $a = v^x/(v^x + v^y)$, we see that

\[ \log \binom{v^x + v^y}{v^x} \le -v^x \log a - v^y \log(1-a) = H(a)|v|. \]

The result now follows by Lemma 2.1. □
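
The counting bound $\log\binom{v^x+v^y}{v^x} \le H(a)|v|$ used in this proof can be checked numerically for small endpoints. The following is a minimal sketch; the function names are illustrative, and a small tolerance absorbs floating-point rounding.

```python
import math


def entropy(a):
    """H(a) = -a log a - (1 - a) log(1 - a), with H(0) = H(1) = 0."""
    if a <= 0.0 or a >= 1.0:
        return 0.0
    return -a * math.log(a) - (1.0 - a) * math.log(1.0 - a)


def log_num_directed_paths(vx, vy):
    """Logarithm of the number of directed paths from (0, 0) to (vx, vy)."""
    return math.log(math.comb(vx + vy, vx))


def bound_holds(vx, vy, tol=1e-9):
    """Check log C(vx + vy, vx) <= H(vx / |v|) * |v| for one endpoint."""
    size = vx + vy
    return log_num_directed_paths(vx, vy) <= entropy(vx / size) * size + tol


if __name__ == "__main__":
    violations = [
        (vx, vy)
        for vx in range(0, 60)
        for vy in range(0, 60)
        if vx + vy >= 1 and not bound_holds(vx, vy)
    ]
    print("violations found:", violations)
```

For endpoints close to an axis, $H(v^x/|v|)$ is small, which is exactly the regime exploited in Lemma 8.5.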

Lemma 8.5. Let $\hat{p}$ denote the maximal path of length $n$. Let $v = (v^x, v^y)$
be a vertex with $|v| \le n - 1$. There is a universal constant $c \in (0,1)$ such
that if $\min\{v^x, v^y\} \le c|v|$, then

\[ \mathbb{P}(v \in \hat{p}) \le 2e^{-c|v|^2/n}. \]

Proof. Let $\tilde{p}$ denote the directed path of length $n$, originating from $0$ and
passing through $v$, that maximizes the sum of weights among all such paths.
Note that this is just the concatenation of the paths $\hat{p}_{0 \to v}$ and $\hat{p}_v^{n-|v|}$. Let
$\tilde{M} = \sum_{u \in \tilde{p}} g_u$. Then note that by Lemma 8.3,

\begin{align*}
\mathbb{E}(\tilde{M}) &= \mathbb{E}(M_{0 \to v}) + \mathbb{E}(M_v^{n-|v|}) \\
  &= \mathbb{E}(M_{0 \to v}) + \mathbb{E}(M_0^{n-|v|}) \\
  &\le \mathbb{E}(M_{0 \to v}) + \mathbb{E}(M) - C|v|,
\end{align*}

where $C$ is the constant from Lemma 8.3. For the rest of this proof, $C$ will
denote this constant. From the above bound, we see that if $\mathbb{E}(M_{0 \to v}) < C|v|$,
then by Proposition 1.3 we have

\begin{align*}
\mathbb{P}(v \in \hat{p}) = \mathbb{P}(\hat{p} = \tilde{p}) &\le \mathbb{P}(\tilde{M} = M) \\
  &\le \mathbb{P}\Bigl(\tilde{M} \ge \mathbb{E}(\tilde{M}) + \frac{\mathbb{E}(M) - \mathbb{E}(\tilde{M})}{2}\Bigr)
     + \mathbb{P}\Bigl(M \le \mathbb{E}(M) - \frac{\mathbb{E}(M) - \mathbb{E}(\tilde{M})}{2}\Bigr) \\
  &\le 2\exp\Bigl(-\frac{(C|v| - \mathbb{E}(M_{0 \to v}))^2}{8n}\Bigr).
\end{align*}

Now, the function $H$ in Lemma 8.4 satisfies $H(a) = H(1-a)$ and

\[ \lim_{a \to 0} H(a) = \lim_{a \to 1} H(a) = 0. \]

Therefore, there exists a constant $c \in (0,1)$ such that if $\min\{v^x, v^y\} \le c|v|$,
then $\mathbb{E}(M_{0 \to v}) \le C|v|/2$, where $C$ is the universal constant in Lemma 8.3.
This proves the current lemma by choosing $c$ sufficiently small. □

Lemma 8.6. Denote the vertices of the maximal path $\hat{p}$ by $(v_0, \ldots, v_{n-1})$.
Extend the path $\hat{p}$ to an infinite directed path $\hat{p}'$ by adding a sequence of
points $v_n, v_{n+1}, \ldots$, where each $v_i$ is the point immediately to the right of
$v_{i-1}$ if $i$ is odd, and immediately above $v_{i-1}$ if $i$ is even. Take any $w \in
\mathbb{Z}_+^2 \setminus \{0\}$ with $|w| \le c(n-1)$, where $c$ is the constant from Lemma 8.5. Let
$L := \min\{i : v_i \ge w\}$. Then for any $l \le n - 1$ such that $|w| \le cl$, we have

\[ \mathbb{P}(L = l) \le 2e^{-cl^2/n}. \]

We also have $\mathbb{P}(L \ge n) \le 2|w|e^{-c(n-1)^2/n}$. As a consequence of these bounds,
we have $\mathbb{E}(L) \le C(|w| + \sqrt{n})$ for some universal constant $C$.

Proof. Let $k := |w|$. It is easy to see that the definition of $L$ ensures that

\[ k \le L \le n + k - 1. \tag{32} \]

Take any integer $l \in [k, n-1]$. Since $|v_L| = L$, we have

\begin{align*}
\mathbb{P}(L = l) &= \mathbb{P}\bigl(v_L = (w^x, l - w^x) \text{ or } v_L = (l - w^y, w^y)\bigr) \\
  &\le \mathbb{P}\bigl((w^x, l - w^x) \in \hat{p}\bigr) + \mathbb{P}\bigl((l - w^y, w^y) \in \hat{p}\bigr).
\end{align*}

The proof of the first bound can now be completed by applying Lemma 8.5.
For the second, note that again by Lemma 8.5,

\begin{align*}
\mathbb{P}(L \ge n) &\le \sum_{i=0}^{w^x - 1} \mathbb{P}\bigl((i, n-1-i) \in \hat{p}\bigr) + \sum_{i=0}^{w^y - 1} \mathbb{P}\bigl((n-1-i, i) \in \hat{p}\bigr) \\
  &\le 2|w|e^{-c(n-1)^2/n}.
\end{align*}

Together with (32), this completes the proof. □

Lemma 8.7. Let $\mathcal{R}_{n,k}$ be the set of all directed paths whose end points
$u = (u^x, u^y)$ and $v = (v^x, v^y)$ satisfy $|u| \le 2n$, $|v| \le 2n$, and

\[ \min\{|u^x - v^x|, |u^y - v^y|\} \le k. \tag{33} \]

Let

\[ R_{n,k} := \max_{p \in \mathcal{R}_{n,k}} \frac{\bigl|\sum_{v \in p} g_v\bigr|}{\sqrt{|p|}}, \]

where $|p|$ denotes the number of vertices in $p$. Then $\mathbb{E}(R_{n,k}^2) \le Ck\log n$.

Proof. We wish to use Lemma 2.1. Clearly, for each $p \in \mathcal{R}_{n,k}$,

\[ \mathrm{Var}\Bigl(\frac{\sum_{v \in p} g_v}{\sqrt{|p|}}\Bigr) = 1. \]

Let us now find an upper bound on the size of $\mathcal{R}_{n,k}$. The end points of a
path can be chosen in at most $Cn^4$ ways. Given the end points, in view
of the restriction (33) and the directed nature of the paths, we can employ
the same counting argument as in the proof of Lemma 8.4 to conclude that
the number of paths is $\le (Cn)^k$. Thus, $|\mathcal{R}_{n,k}| \le n^4 (Cn)^k$. The claim now
follows by Lemma 2.1. □

Lemma 8.8. For any $w \in \mathbb{Z}_+^2$, we have

\[ \mathbb{E}(M - M_w)^2 \le C(|w|^2 + \sqrt{n}\,|w|)\log n. \]

Proof. As before, let $k = |w|$. Since $\mathbb{E}(M^2)$ is bounded by $Cn^2$ (easily
verified by Lemma 2.1), we can assume without loss of generality that $k \le
c(n-1)$. Let $\hat{p}'$ and $L$ be defined as in Lemma 8.6. Denote the components
of $v_i$ by $v_i^x$ and $v_i^y$. Then we know that either $v_L^x = w^x$ or $v_L^y = w^y$. Define
a directed path $p'' = (u_0, \ldots, u_{n-1})$ as follows. If $v_L^x = w^x$, let $u_0 = w$ and
let $u_i$ be the point immediately above $u_{i-1}$ for $1 \le i \le L - k$. If $v_L^x > w^x$
and $v_L^y = w^y$, let $u_0 = w$ and let $u_i$ be the point immediately to the right of
$u_{i-1}$ for $1 \le i \le L - k$. Note that in either case, we end up with $u_{L-k} = v_L$
because $|v_L| = L$. Thereafter, merge the path with $\hat{p}'$, that is, let

\[ u_i = v_{k+i} \quad \text{for } L - k + 1 \le i \le n - 1. \]

Let $\mathcal{R}_{n,k}$ and $R_{n,k}$ be defined as in Lemma 8.7 above. Clearly, the paths
$(v_0, \ldots, v_{\min\{L, n-1\}})$ and $(u_0, \ldots, u_{L-k})$ belong to $\mathcal{R}_{n,k}$. With this
observation, and the fact that $p''$ is a directed path of length $n$ starting at $w$, we
have

n— 1 L—k n+k—1 

j=0 j=0 i=L 

min{L,n-l} n+k-1 L-k 

= M - Y 9v,+ Yl 9v,+Y 
i=0 i=max{n,L} i=0 

> M — k — k max \gJ. 

\v\<2n 

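Therefore, by the Cauchy-Schwarz inequality, Lemma 2.1, Lemma 8.6, and
Lemma 8.7 it follows that

\[ \mathbb{E}(M - M_w)_+^2 \le C(|w|^2 + \sqrt{n}\,|w|)\log n. \tag{34} \]

Next, let us get a bound on $\mathbb{E}(M_w - M)_+^2$. Denote the vertices of $\hat{p}_w$ by
$(v_{w,0}, \ldots, v_{w,n-1})$. Let $(u_0, \ldots, u_{k-1})$ be any fixed directed path with $u_0 = 0$
and $u_{k-1} = w$. Let $p'$ be the directed path

\[ (u_0, \ldots, u_{k-1}, w + v_{w,1}, w + v_{w,2}, \ldots, w + v_{w,n-k}). \]

Then note that

\begin{align*}
M \ge \sum_{v \in p'} g_v &= M_w + g_{u_0} + \cdots + g_{u_{k-2}} - g_{w + v_{w,n-k+1}} - \cdots - g_{w + v_{w,n-1}} \\
  &\ge M_w - 2(k-1) \max_{|v| \le 2n} |g_v|.
\end{align*}

This shows that

\[ \mathbb{E}(M_w - M)_+^2 \le C|w|^2 \log n. \tag{35} \]

Combining (34) and (35), our proof is done. □

Proof of Theorem 8.1. For each $w \in B$, we have

\[ \mathbb{E}(M - M_w)^2 \le C(|w|^2 + \sqrt{n}\,|w|)\log n \le Cn^{3/4}\log n. \]

Therefore,

\[ \mathbb{E}(M - \tilde{M})^2 \le Cn^{3/4}\log n. \]

Thus, by Lemmas 8.2 and 8.8 we have

\begin{align*}
\mathrm{Var}(M) &\le \mathbb{E}\bigl(M - \mathbb{E}(\tilde{M})\bigr)^2 \\
  &\le 2\,\mathbb{E}(M - \tilde{M})^2 + 2\,\mathrm{Var}(\tilde{M}) \\
  &\le Cn^{3/4}\log n + \frac{Cn}{\log n}.
\end{align*}

This completes the proof. □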
9. Example: Generalized SK model of spin glasses

Let $n$ be a positive integer and let $\Sigma_n = \{-1,1\}^n$. Suppose we have $n$
magnetic particles, each having spin $+1$ or $-1$. A typical element $\sigma =
(\sigma_1, \ldots, \sigma_n)$ of $\Sigma_n$ is called a 'configuration' of spins. Let $g = (g_{ij})_{1 \le i < j \le n}$
be a collection of independent standard Gaussian random variables. Given
a realization of $g$, define the energy of a configuration $\sigma$ as

\[ H_n(\sigma) = -\frac{1}{\sqrt{n}} \sum_{1 \le i < j \le n} g_{ij}\sigma_i\sigma_j. \]

The energy landscape so defined corresponds to the famous Sherrington- 
Kirkpatrick model of spin glasses [SS] in the absence of an external field. 
In the last thirty years, the SK model has been an inspiration for a large 
body of groundbreaking physics (see the Mezard-Parisi-Virasoro book [53])
as well as beautiful rigorous mathematics (see e.g. [2], [26], [19], [63], [73],
[33] , [34] , [77] , [76] ) . An extensive collection of rigorous results can be found 
in Talagrand's book [7^, a new edition of which is in preparation. 

A natural variant of the SK model is the $p$-spin SK model, where the
energy of a configuration is defined as

\[ H_{n,p}(\sigma) = -\frac{1}{n^{(p-1)/2}} \sum_{1 \le i_1, i_2, \ldots, i_p \le n} g_{i_1 i_2 \ldots i_p}\,\sigma_{i_1}\sigma_{i_2}\cdots\sigma_{i_p}, \tag{36} \]

where again, $(g_{i_1 i_2 \ldots i_p})_{1 \le i_1, \ldots, i_p \le n}$ is a fixed realization of i.i.d. standard
Gaussian random variables. (Usually, the sum is taken over distinct $i_1, \ldots, i_p$.
We take it over all $i_1, \ldots, i_p$ to avoid certain technical inconveniences.) The
$p$-spin model was suggested by Derrida, and subsequently studied by Gross
and Mezard [32] and Gardner [27].

A generalized version of the SK model that covers all p-spin models was 
considered by Talagrand in his celebrated paper on the Parisi formula [76]. 
It is simply a linear combination of the p-spin energies over all p and covers 
all cases considered till now. Given a sequence of non-negative real numbers 
c = (c2, C3, . . .) such that 

oo 
p=2 

define the energy function

\[ H_{n,c}(\sigma) := \sum_{p=2}^{\infty} c_p^{1/2} H_{n,p}(\sigma), \]

where $H_{n,p}$ is the $p$-spin energy defined above in (36). Then the usual
SK model corresponds to the sequence $(1, 0, 0, \ldots)$, and the $p$-spin model
corresponds to the sequence that has $1$ at the $p$th position and $0$ elsewhere.

The objective of this section is to analyze the energy landscape of the
generalized SK model. In particular, we are interested in the behavior of
the ground state, i.e. the configuration with minimum energy, and the
fluctuations of the energy of the ground state.

There are two notable conjectures about the ground state of the SK model,
both seemingly beyond the reach of current technology. One is that the
ground state exhibits chaos (exactly in our sense). A physics proof for chaos
was given by McKay, Berger, and Kirkpatrick [50]. The second conjecture
is about the 'fluctuation exponent' of the ground state energy.

Definition 9.1. The energy of the ground state is said to have fluctuation
exponent $\rho$ if it has fluctuations of order $n^{\rho + o(1)}$.

Classical results like Proposition 1.2 and Proposition 1.3 imply that for the
generalized SK model, $\rho \le 1/2$. It is predicted by physicists [ISl [22] that
$\rho = 1/6$ in the SK model, although the prediction has not yet been reliably
verified by simulations.

As we know from our Theorem 1.8, the two problems are related. Indeed,
superconcentration and hence chaos happens if the fluctuation exponent is
anything strictly less than 1/2. The main result of this section is that the
fluctuation exponent of the ground state energy in the generalized SK model
is at most 3/8 if $c_p$ decreases to zero sufficiently slowly as $p \to \infty$.

Theorem 9.2. Let $I(x) := \frac{1}{2}\bigl((1+x)\log(1+x) + (1-x)\log(1-x)\bigr)$, and let
$(c_p^*)_{p \ge 2}$ be constants such that for all $x \in (-1,1)$,

\[ \frac{I(x)}{2\log 2 - I(x)} = \sum_{p=2}^{\infty} c_p^* x^p. \]

Then $c_p^* \ge 0$ for all $p$ and $\sum c_p^* = 1$. Suppose $c = (c_p)_{p \ge 2}$ is any non-negative
sequence such that $\sum c_p = 1$, and for all $r \ge 2$,

\[ \sum_{p=2}^{r} c_p \le \sum_{p=2}^{r} c_p^*. \tag{37} \]

Then the ground state energy of the generalized SK model defined by the
sequence $c$ has fluctuation exponent $\le 3/8$, and consequently, the ground
state is chaotic.
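
The coefficients $c_p^*$ can be computed numerically, which gives a feel for the condition on partial sums in Theorem 9.2. The sketch below rests on two facts that are used here as assumptions of the illustration: the expansion $I(x) = \sum_{k \ge 1} x^{2k}/(2k(2k-1))$, and the recursion obtained by comparing coefficients in $(2\log 2 - I(x))\sum_p c_p^* x^p = I(x)$. The partial-sum check reads the condition as 'partial sums of $c$ are dominated by those of $c^*$', which is how it is used in the proof later in this section. Function names are illustrative, and only a finite truncation is examined.

```python
import math


def i_coefficients(P):
    """Coefficients i_p of I(x) = sum_p i_p x^p for 0 <= p <= P (only even p >= 2 are nonzero)."""
    coef = [0.0] * (P + 1)
    for p in range(2, P + 1, 2):
        coef[p] = 1.0 / (p * (p - 1))
    return coef


def c_star(P):
    """Coefficients c*_p of I(x) / (2 log 2 - I(x)) for 0 <= p <= P."""
    i = i_coefficients(P)
    two_log2 = 2.0 * math.log(2.0)
    c = [0.0] * (P + 1)
    for p in range(2, P + 1):
        # coefficient of x^p in (2 log 2 - I(x)) * S(x) = I(x), where S = sum_p c*_p x^p
        conv = sum(i[q] * c[p - q] for q in range(2, p - 1))
        c[p] = (i[p] + conv) / two_log2
    return c


def partial_sums_dominated(c, c_ref, tol=1e-12):
    """Check sum_{p<=r} c_p <= sum_{p<=r} c_ref_p for every r up to the truncation."""
    partial, partial_ref = 0.0, 0.0
    for cp, cref in zip(c[2:], c_ref[2:]):
        partial += cp
        partial_ref += cref
        if partial > partial_ref + tol:
            return False
    return True


if __name__ == "__main__":
    P = 200
    cs = c_star(P)
    print("c*_2 =", cs[2], " c*_4 =", cs[4])
    print("partial sum up to p = 200:", sum(cs))
    sk = [0.0, 0.0, 1.0] + [0.0] * (P - 2)  # the sequence (1, 0, 0, ...)
    print("pure 2-spin sequence passes the check:", partial_sums_dominated(sk, cs))
```

Since every $c_p^*$ is non-negative and the series sums to 1, every finite partial sum of $c^*$ is strictly below 1. The check therefore fails for any sequence with finitely many nonzero terms summing to 1, which is consistent with the requirement that $c_p$ decay slowly.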

Incidentally, it was shown by Wehr and Aizenman [82] that the lattice
spin glass (i.e. the Edwards-Anderson model) is not superconcentrated, and
hence, not chaotic in our sense. Therefore, if we are looking for chaos in
spin glasses, the only option is to look in mean-field models.

The proof of Theorem 9.2 is based on our result about extremal Gaussian
fields, namely, Theorem 5.1. The minorizing condition (37) suffices to
guarantee extremality of the energy landscape considered as a Gaussian
field. We will actually prove a more general version of Theorem 9.2 with
precise quantitative bounds. Fix $n$, and consider a centered Gaussian field
$X = (X_\sigma)_{\sigma \in \{-1,1\}^n}$ satisfying

\[ \mathrm{Cov}(X_\sigma, X_{\sigma'}) = \xi\Bigl(\frac{\sigma \cdot \sigma'}{n}\Bigr) \quad \text{for all } \sigma, \sigma' \in \{-1,1\}^n, \]

where $\sigma \cdot \sigma' = \sum_{i=1}^{n} \sigma_i \sigma'_i$ is the usual inner product, and $\xi : [-1,1] \to [-1,1]$
is a function with $\xi(1) = 1$. Let $X'$ be an independent copy of $X$. As usual,
define the perturbed field

\[ X^t = e^{-t}X + \sqrt{1 - e^{-2t}}\,X'. \]

Let $\hat{\sigma}^t = \operatorname{argmax}_\sigma X^t_\sigma$. The following theorem is a quantitative version of
Theorem 9.2.



Theorem 9.3. Let $I(x)$ be as in Theorem 9.2. Suppose that

\[ |\xi(x)| \le \xi(|x|) \quad \text{and} \quad \xi(x) \le \frac{I(x)}{2\log 2 - I(x)} \quad \text{for all } x \in (-1,1). \tag{38} \]

Then we have the bounds

(log n \ 
, and 
n J 



n )) ri 

where C is a universal constant. 

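Before proving Theorem 9.3, let us first show that it implies Theorem 9.2.
First of all, if we define $X_\sigma = -n^{-1/2}H_{n,c}(\sigma)$, then

\[ \mathrm{Cov}(X_\sigma, X_{\sigma'}) = \sum_{p=2}^{\infty} c_p \Bigl(\frac{\sigma \cdot \sigma'}{n}\Bigr)^p. \]

Thus, we are in the setting of Theorem 9.3 with $\xi(x) = \sum c_p x^p$. So if we
can show that (38) follows from (37), then Theorem 9.3 would imply that
$\mathrm{Var}(\max_\sigma X_\sigma) \le n^{-1/4 + o(1)}$ and hence that

\[ \mathrm{Var}\bigl(\min_\sigma H_{n,c}(\sigma)\bigr) \le n^{3/4 + o(1)}, \]

which proves the claim. The implication of (38) from (37) is proved as
follows. First, it is easy to verify that the power series for $I(x)$ has
non-negative coefficients, and therefore so does

\[ \frac{I(x)}{2\log 2 - I(x)} = \sum_{k=1}^{\infty} \Bigl(\frac{I(x)}{2\log 2}\Bigr)^k. \]

For each $r$, let $C_r = \sum_{p=2}^{r} c_p$ and $C_r^* = \sum_{p=2}^{r} c_p^*$, with $C_1 = C_1^* = 0$. The
assumption (37) says that $C_r \le C_r^*$ for each $r$. Thus for any $x \in (0,1)$,

\begin{align*}
\sum_{p=2}^{\infty} c_p x^p &= \sum_{p=2}^{\infty} (C_p - C_{p-1}) x^p = \sum_{p=2}^{\infty} C_p (x^p - x^{p+1}) \\
  &\le \sum_{p=2}^{\infty} C_p^* (x^p - x^{p+1}) = \sum_{p=2}^{\infty} c_p^* x^p = \frac{I(x)}{2\log 2 - I(x)}.
\end{align*}

Since $|\xi(x)| \le \xi(|x|)$ and $I$ is symmetric, the inequality holds for $x \in (-1,0]$
as well. This completes the argument for Theorem 9.2.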

Proof of Theorem 9.3. We use Theorem 1.11. Let $\mathbf{1}$ denote the configuration
of all 1's. Then by symmetry, we have

<T,o-'e{-i,i}" (Te{-i,i}" n 

By the binomial theorem, we know that the number of configurations that
have $\sigma \cdot \mathbf{1} = k$ is exactly

\[ \binom{n}{\frac{n+k}{2}}, \]

which is interpreted as zero if $k$ and $n$ have different parity. Now, we have
that for any $p \in [0,1]$,

\[ \binom{n}{\frac{n+k}{2}} p^{(n+k)/2} (1-p)^{(n-k)/2} \le 1. \]

Taking $p = (n+k)/2n$, we get

\[ \binom{n}{\frac{n+k}{2}} \le e^{n(\log 2 - I(k/n))}, \]

where $I(x)$ is defined in the statement of the theorem. Again, the hypothesis
implies that

\[ \frac{2\log 2}{1 + \xi(x)} \ge 2\log 2 - I(x) \quad \text{for all } x \in (-1,1). \]

Thus, we have

< Cn. 

The proof now follows from Theorem 1.11. □
10. Example: The Discrete Gaussian Free Field

In this section, we show that the Discrete Gaussian Free Field (DGFF) 
on an n X n grid (defined below) is a superconcentrated Gaussian field. The 
massless Gaussian free field is an important mathematical object, inspiring 
a substantial amount of rigorous literature. It is essentially a higher dimen- 
sional analog of Brownian motion, where the dimension of time (rather than 
space) is higher than one. Although initially introduced as a toy model for 
the Ising interface, the topic has grown in its own right and has found impor- 
tant intersections with subjects as diverse as quantum gravity and stochastic 
Loewner evolutions. The DGFF is a finite approximation to the massless 
free field, just as random walk is a finite approximation of Brownian motion. 
For further motivation, definitions, and a review of the rigorous literature, 
we refer to Giacomin [28] and the excellent recent survey of Sheffield [64].

10.1. Zero boundary condition. Let K := {0, . . . , n - 1}^, and dVn be 
the inner boundary, that is, the points in which have a nearest neighbor 
outside. Let int(T4,) := Vn\dVn- The two-dimensional discrete Gaussian 
free field on Vn with zero boundary condition is defined as a family — 
{'Px}x(^V„ of centered gaussian random variables with covariances given by 
the discrete Green's function of the (discrete) Laplacian on int(T^). This 
means, explicitly, that c/ia; = for x G dVn, and 

(39) Cov((/)^,(/)y) = G„(x,y) = E^f ^ j, x,yeint(F„), 

where {r/i}i>o is a standard symmetric nearest neighbor random walk on the 
two-dimensional lattice Z^, starting at x, and tqv„ is the first entrance time 
in dVn- The law of is the Gaussian distribution with density function 
proportional to 

(40) e^v{-\Y.^ct>^-^yf)^ 

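where $x \sim y$ means that $x$ and $y$ are neighbors in $V_n$ (each pair counted once), 
with the understanding that we set $\phi_x = 0$ for $x \in \partial V_n$ in the above formula. 
This can be seen as follows. Fix $y \in \mathrm{int}(V_n)$, and for each $x \in \mathrm{int}(V_n)$ let 
$f(x)$ and $g(x)$ denote the left and right sides of (39), where the covariance on 
the left is computed under the density (40). Using (40) it is easy 
to show that for any $x \in \mathrm{int}(V_n)$, 

$\mathbb{E}(\phi_x \mid (\phi_z)_{z \ne x}) = \frac{1}{4} \sum_{z \in V_n,\, z \sim x} \phi_z.$ 

It follows that the function $f$ is discrete harmonic on $\mathrm{int}(V_n) \setminus \{y\}$, that is, 

$f(x) = \mathbb{E}\bigl(\mathbb{E}(\phi_x \mid (\phi_z)_{z \ne x})\, \phi_y\bigr) = \frac{1}{4} \sum_{z \in V_n,\, z \sim x} f(z).$ 

Similarly, it is easy to show that $g(x)$ is discrete harmonic on $\mathrm{int}(V_n) \setminus \{y\}$ 
by conditioning on $\eta_1$. Again, putting 

$\hat{\phi}_y := \mathbb{E}(\phi_y \mid (\phi_z)_{z \ne y}) = \frac{1}{4} \sum_{z \in V_n,\, z \sim y} \phi_z,$ 

we have 

$f(y) - \frac{1}{4} \sum_{z \in V_n,\, z \sim y} f(z) = \mathbb{E}\bigl((\phi_y - \hat{\phi}_y)\phi_y\bigr)$ 

$= \mathbb{E}\bigl((\phi_y - \hat{\phi}_y)^2\bigr) + \mathbb{E}\bigl((\phi_y - \hat{\phi}_y)\hat{\phi}_y\bigr)$ 

$= \mathbb{E}\bigl(\mathrm{Var}(\phi_y \mid (\phi_z)_{z \ne y})\bigr) + 0$ 

$= 1.$ 

Similarly, conditioning on $\eta_1$, we can show 

$g(y) - \frac{1}{4} \sum_{z \in V_n,\, z \sim y} g(z) = 1.$ 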

Thus, $f - g$ is discrete harmonic on $\mathrm{int}(V_n)$. But $f = g = 0$ on $\partial V_n$. Thus, 
by the maximum principle, we must have $f = g$ everywhere on $V_n$, which proves that the density (40) 
indeed corresponds to the Gaussian field with covariance (39). 
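
As a sanity check on the argument above, the following sketch computes one column of the Green's function in (39) by summing the killed random walk step by step, and then measures how far the result is from satisfying the two identities derived from the density (40): $g(x) = \frac{1}{4}\sum_{z \sim x} g(z)$ for $x \ne y$, and $g(y) - \frac{1}{4}\sum_{z \sim y} g(z) = 1$. The function names, the grid size and the tolerance are illustrative choices and do not come from the paper.

```python
def neighbors(x):
    """Four nearest neighbors of a lattice point in Z^2."""
    a, b = x
    return [(a + 1, b), (a - 1, b), (a, b + 1), (a, b - 1)]


def green_column(n, y, tol=1e-14):
    """Return {x: G_n(x, y)} for interior x, summing P_x(eta_i = y before absorption).

    The walk is killed on the inner boundary of V_n = {0, ..., n-1}^2, so only
    interior points (coordinates in 1..n-2) carry mass.
    """
    interior = [(a, b) for a in range(1, n - 1) for b in range(1, n - 1)]
    # u[x] = P_x(eta_i = y and the walk has not hit the boundary); start with i = 0.
    u = {x: (1.0 if x == y else 0.0) for x in interior}
    g = dict(u)
    while max(u.values()) > tol:
        # One more step: average u over the four neighbors (boundary counts as 0).
        u = {x: 0.25 * sum(u.get(z, 0.0) for z in neighbors(x)) for x in interior}
        for x in interior:
            g[x] += u[x]
    return g


def harmonic_defect(n, y):
    """Largest violation of g(x) - (1/4) sum_{z ~ x} g(z) = 1{x = y} over the interior."""
    g = green_column(n, y)
    worst = 0.0
    for x in g:
        lhs = g[x] - 0.25 * sum(g.get(z, 0.0) for z in neighbors(x))
        worst = max(worst, abs(lhs - (1.0 if x == y else 0.0)))
    return worst


if __name__ == "__main__":
    print(harmonic_defect(6, (2, 3)))
```

Since the same two identities characterize the covariance under (40), a defect at the level of rounding error is consistent with $f = g$ on this small grid.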

It was shown by Bolthausen, Deuschel, and Giacomin ([11], Lemma 1) 
that as $n \to \infty$, we have 

$\max_{y \in V_n} \mathrm{Var}(\phi_y) = \frac{2}{\pi} \log n + O(1).$ 

The unexpected and surprising fact, also in the same paper ([11], Theorem 2), 
is that as $n \to \infty$, 

$\mathbb{E}\bigl(\max_{y \in V_n} \phi_y\bigr) \sim 2\sqrt{\frac{2}{\pi}}\, \log n,$ 

exactly as if $\{\phi_y\}_{y \in V_n}$ were independent. Here $a_n \sim b_n$ means, as usual, 
that $\lim_{n \to \infty} a_n/b_n = 1$. In our terminology, the DGFF with zero boundary 
condition is extremal. Combined with Theorem 5.1, a direct consequence is 
the following result. 

Proposition 10.1. The discrete Gaussian free field on an $n \times n$ grid with 
zero boundary condition is superconcentrated (as $n \to \infty$), meaning that 

$\mathrm{Var}\bigl(\max_{y \in V_n} \phi_y\bigr) = o(\log n).$ 

Consequently, the two-dimensional DGFF is chaotic and has the multiple 
peaks property. 

An immediate comment is that we cannot give a bound on the variance 
better than $o(\log n)$. This is because the result about the asymptotic 
behavior of $\mathbb{E}(\max_y \phi_y)$ available from [11] does not include any explicit rate 
of convergence, which makes it impossible for us to get a more informative 
bound from Theorem 5.1. 

We next consider the free field on an $n \times n$ torus, where we can take 
advantage of the symmetry to prove that $\mathrm{Var}(\max_y \phi_y) = O(\log\log n)$. 
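
The phrase 'exactly as if independent' in Section 10.1 can be unpacked with a short computation. For $N = n^2$ independent centered Gaussians of variance $\sigma^2$, the expected maximum is asymptotically $\sigma\sqrt{2\log N}$; plugging in $\sigma^2 = \frac{2}{\pi}\log n$ gives $2\sqrt{2/\pi}\,\log n$. The sketch below only evaluates both expressions. It ignores the $O(1)$ term in the variance and is meant as a heuristic illustration, not as part of any proof; the function names are made up for this example.

```python
import math


def independent_heuristic(n):
    """sigma * sqrt(2 log N) with sigma^2 = (2/pi) log n and N = n^2 points."""
    sigma = math.sqrt((2.0 / math.pi) * math.log(n))
    return sigma * math.sqrt(2.0 * math.log(n * n))


def dgff_leading_order(n):
    """Leading-order expected maximum 2 * sqrt(2/pi) * log n quoted in the text."""
    return 2.0 * math.sqrt(2.0 / math.pi) * math.log(n)


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, independent_heuristic(n), dgff_leading_order(n))
```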

10.2. DGFF on a torus. Let $T_n$ be the set $\{0, \ldots, n-1\}^2$ endowed with 
the graph structure of a torus, that is, $(a, b)$ and $(c, d)$ are adjacent if $a - c = 
\pm 1 \pmod{n}$ and $b - d = \pm 1 \pmod{n}$. We wish to define a Gaussian free 
field on this graph. However, the graph has no natural boundary. The 
easiest (and perhaps the most natural) way to overcome this problem is to 
modify the definition (39) of the covariance by replacing the stopping time 
$\tau_{\partial V_n}$ with a random time $\tau$ that is of the same order of magnitude as $\tau_{\partial V_n}$, 
but is independent of all else. Specifically, we prescribe 

(41) $\mathrm{Cov}(\phi_x, \phi_y) = \mathbb{E}_x\Bigl(\sum_{i=0}^{\tau} 1_{\{\eta_i = y\}}\Bigr),$ 

where $(\eta_i)_{i \ge 0}$ is a simple random walk on the torus started at $x$, and $\tau$ is 
a random variable independent of $(\eta_i)_{i \ge 0}$, which we take to be Geometrically 
distributed with mean $n^2$. The reason for the particular choice of the 
Geometric distribution is that it translates into a simple modification of the 
density (40) by the introduction of a small 'mass'; the new density function 
turns out to be 

(42) exp(-il^5:(0. -^,)^ -^T.^^)- 

^ x^y X ' 

Here $x \sim y$ means that $x$ and $y$ are neighbors on the torus. To show that 
this density indeed corresponds to that of a centered Gaussian field with 
covariance (41), we proceed as in the case of the DGFF with zero boundary 
condition. Fix $y \in T_n$, and let $f(x)$ and $g(x)$ be the two sides of (41). From 
(41) and (42), one can check using conditional expectations that for $x \ne y$, 

(The second identity holds because conditional on $\tau \ge 1$, $\tau - 1$ has the same 
distribution as $\tau$. This is where we use that $\tau$ has a Geometric law.) Again, 
using similar computations as before, it can be checked that 

zeVn, Z^X 

z£Vn, z^y 

Combining the last two displays, one can conclude that $|f - g|$ is a nonnegative 
strictly subharmonic function on $T_n$, which implies that it must be zero 
everywhere. 

Although we assign a small mass in our definition of the DGFF on the 
torus, we can still call it a 'massless free field' in an asymptotic sense because 
the stopping times $\tau_{\partial V_n}$ and $\tau$ are both of order $n^2$ and it is not difficult to 
show that the covariances in the two models differ by $O(1)$. 

Let us now state the main result of this section, which shows that the 
DGFF on the torus is superconcentrated, with an explicit bound of order 
log log n on the variance of the maximum. 

Theorem 10.2. Let $(\phi_x)_{x \in T_n}$ be the DGFF on the torus defined above. For 
some universal constant $C$, we have 

$\mathrm{Var}\bigl(\max_{x \in T_n} \phi_x\bigr) \le C \log\log n.$ 

The proof of this result is via the use of hypercontractivity, more specifically 
Theorem 4.3. The key advantage in the torus model is that we know that 
the location of the maximum is uniformly distributed. In the zero boundary 
situation, we had very little information about the location of the maximum. 
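
The symmetry invoked here comes from the translation invariance of the covariance (41). The sketch below evaluates one row of that covariance on a small torus and can be used to confirm numerically that it depends only on the difference of the two points. Two conventions are assumptions of this example rather than statements from the paper: the walk takes nearest-neighbor steps $(\pm 1, 0)$, $(0, \pm 1)$ modulo $n$, and $\tau$ is geometric on $\{0, 1, 2, \ldots\}$ with mean $n^2$, so that $\mathbb{P}(\tau \ge i) = q^i$ with $q = n^2/(n^2+1)$.

```python
def torus_covariance_row(n, x, tol=1e-15):
    """Return {y: sum_i q^i P_x(eta_i = y)} on the n x n torus.

    Assumptions (not taken from the paper): nearest-neighbor steps modulo n,
    and tau geometric on {0, 1, 2, ...} with mean n^2, so P(tau >= i) = q^i.
    """
    q = n * n / (n * n + 1.0)
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    points = [(a, b) for a in range(n) for b in range(n)]
    p = {y: (1.0 if y == x else 0.0) for y in points}  # law of eta_0
    cov = dict(p)
    weight = 1.0  # current value of q^i
    while weight > tol:
        # Push the law of the walk forward by one step.
        nxt = {y: 0.0 for y in points}
        for (a, b), mass in p.items():
            for da, db in steps:
                nxt[((a + da) % n, (b + db) % n)] += 0.25 * mass
        p = nxt
        weight *= q
        for y in points:
            cov[y] += weight * p[y]
    return cov


if __name__ == "__main__":
    row = torus_covariance_row(5, (0, 0))
    print(row[(0, 0)], row[(1, 2)])
```

Under these conventions each row sums to $\sum_i q^i = n^2 + 1$, which gives a simple consistency check on the truncated series.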

Let us now proceed to prove Theorem 10.2. The first step is a basic 
observation about the simple symmetric random walk on $\mathbb{Z}$. 

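Lemma 10.3. Suppose $(\alpha_i)_{i \ge 0}$ is a simple symmetric random walk on $\mathbb{Z}$, 
starting at 0. Then for any $k \in \mathbb{Z}$ and $i \ge 1$, we have 

$\mathbb{P}(\alpha_i = k) \le \frac{C}{\sqrt{i}}\, e^{-k^2/4i},$ 

where $C$ is a universal constant. 

Proof. If $|k| > i$ or $k \not\equiv i \pmod{2}$, then $\mathbb{P}(\alpha_i = k) = 0$ and we have nothing 
to prove. If $|k| = i$, we have $\mathbb{P}(\alpha_i = k) = 2^{-i}$, which is consistent with the 
statement of the lemma. In all other cases, 

$\mathbb{P}(\alpha_i = k) = \binom{i}{(i+k)/2}\, 2^{-i}.$ 

Using the Stirling approximation, we get 

$\mathbb{P}(\alpha_i = k) \le \frac{C}{\sqrt{i}} \exp\Bigl(-\frac{i+k+1}{2} \log\Bigl(1 + \frac{k}{i}\Bigr) - \frac{i-k+1}{2} \log\Bigl(1 - \frac{k}{i}\Bigr)\Bigr)$ 

$= \frac{C}{\sqrt{i}} \exp\Bigl(-i\, l\Bigl(\frac{k}{i}\Bigr) - \frac{1}{2} \log\Bigl(1 - \frac{k^2}{i^2}\Bigr)\Bigr),$ 

where 

$l(x) := \frac{1+x}{2} \log(1+x) + \frac{1-x}{2} \log(1-x).$ 

The function $l$ has a power series expansion 

$l(x) = \sum_{p=1}^{\infty} \frac{x^{2p}}{2p(2p-1)}, \quad |x| < 1.$ 

In particular, $l(x) \ge x^2/2$. This inequality suffices to prove the lemma 
when, say, $|k| \le i/2$. On the other hand, if $i/2 < |k| < i$ and $i$ is so large that 
$\log i \le i/8$ (which can be assumed by choosing a suitably large $C$), we have 

$-\frac{1}{2} \log\Bigl(1 - \frac{k^2}{i^2}\Bigr) \le \frac{\log i}{2} \le \frac{i}{16} \le \frac{k^2}{4i},$ 

which implies that 

$-i\, l\Bigl(\frac{k}{i}\Bigr) - \frac{1}{2} \log\Bigl(1 - \frac{k^2}{i^2}\Bigr) \le -\frac{k^2}{2i} + \frac{k^2}{4i} = -\frac{k^2}{4i}.$ 

This completes the proof. □ 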
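
The bound of Lemma 10.3 is easy to probe numerically. The sketch below, written under the assumption that the lemma reads $\mathbb{P}(\alpha_i = k) \le C i^{-1/2} e^{-k^2/4i}$, computes the exact binomial probabilities and reports the smallest constant $C$ that works over a finite range of $i$. The helper names are hypothetical, and a finite scan of this kind illustrates the lemma but does not replace its proof.

```python
import math


def walk_probability(i, k):
    """P(alpha_i = k) for the simple symmetric random walk on Z started at 0."""
    if abs(k) > i or (i + k) % 2 != 0:
        return 0.0
    return math.comb(i, (i + k) // 2) / 2.0 ** i


def smallest_constant(max_steps):
    """Smallest C with P(alpha_i = k) <= C / sqrt(i) * exp(-k^2 / (4 i)) for 1 <= i <= max_steps."""
    best = 0.0
    for i in range(1, max_steps + 1):
        for k in range(-i, i + 1):
            ratio = walk_probability(i, k) * math.sqrt(i) * math.exp(k * k / (4.0 * i))
            best = max(best, ratio)
    return best


if __name__ == "__main__":
    print(smallest_constant(200))
```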

We are going to use random walks on $\mathbb{Z}^2$ to produce random walks on $T_n$. 
For this purpose, we observe that there is a natural map $Q_n : \mathbb{Z}^2 \to T_n$ 
which takes a point $(x_1, x_2) \in \mathbb{Z}^2$ to the unique point $(x_1', x_2')$ in $T_n$ satisfying 
$x_1 \equiv x_1' \pmod{n}$ and $x_2 \equiv x_2' \pmod{n}$. For any $x, y \in T_n$, define the toric 
Euclidean distance $d_n(x, y)$ as: 

$d_n(x, y) := d(x, Q_n^{-1}(y)),$ 

where $d(x, A)$ is the usual Euclidean distance of a point $x$ from a set $A$. 
It is not difficult to verify that actually $d_n(x, y) = d(Q_n^{-1}(x), Q_n^{-1}(y))$ and 
therefore the definition is symmetric in $x$ and $y$. 
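
For concreteness, the toric distance just defined can be computed coordinate by coordinate, since minimizing over the lattice translates $y + (k_1 n, k_2 n)$ decouples across the two coordinates. The following is a minimal sketch; the function name is an illustrative choice.

```python
import math


def toric_distance(x, y, n):
    """Toric Euclidean distance d_n(x, y) on T_n = {0, ..., n-1}^2.

    Minimizing |x - z| over all lattice translates z = y + (k1 * n, k2 * n)
    can be done one coordinate at a time, which is what min(gap, n - gap) does.
    """
    total = 0
    for a, b in zip(x, y):
        gap = abs(a - b) % n
        gap = min(gap, n - gap)
        total += gap * gap
    return math.sqrt(total)


if __name__ == "__main__":
    print(toric_distance((0, 0), (4, 4), 5))
```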

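Lemma 10.4. Let $(\eta_i)_{i \ge 0}$ be a simple symmetric random walk on $T_n$. Then 
for any $x, y \in T_n$ and any $i \ge 1$, we have 

$\mathbb{P}_x(\eta_i = y) \le \frac{C}{i}\, e^{-d_n(x,y)^2/4i}$ if $i \le n^2$, 

$\mathbb{P}_x(\eta_i = y) \le C n^{-2}$ if $i > n^2$, 

where $C$ is a universal constant and $\mathbb{P}_x$ denotes the law of the random walk 
starting from $x$. 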

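Proof. Let $(\beta_i)_{i \ge 0}$ be a simple symmetric random walk on $\mathbb{Z}^2$. Then a random 
walk on the torus is easily obtained as $\eta_i = Q_n(\beta_i)$. For any $x, y \in T_n$, 
we have 

(43) $\mathbb{P}_x(\eta_i = y) = \sum_{z \in Q_n^{-1}(y)} \mathbb{P}_x(\beta_i = z).$ 

Now, for any $x$ and $z$, Lemma 10.3 shows that 

(44) $\mathbb{P}_x(\beta_i = z) \le \frac{C}{i} \exp\Bigl(-\frac{d(x, z)^2}{4i}\Bigr),$ 

where $d(x, z)$ is the Euclidean distance between $x$ and $z$. Now fix $x, y \in T_n$, 
and let $z$ be the nearest point to $x$ in $Q_n^{-1}(y)$. Then, if $x = (x_1, x_2)$ and 
$z = (z_1, z_2)$, we have by (43) and (44) that 

$\mathbb{P}_x(\eta_i = y) \le \frac{C}{i} \sum_{k_1, k_2 \in \mathbb{Z}} \exp\Bigl(-\frac{(x_1 - z_1 + k_1 n)^2 + (x_2 - z_2 + k_2 n)^2}{4i}\Bigr).$ 

It is easy to see that $|x_j - z_j| \le n/2$ for $j = 1, 2$ (otherwise, we can choose 
a 'better' $z_j$ so that $z$ is closer to $x$). Thus, for $j = 1, 2$, 

$(x_j - z_j + k_j n)^2 = (x_j - z_j)^2 + k_j^2 n^2 + 2(x_j - z_j) k_j n$ 

$\ge (x_j - z_j)^2 + |k_j|(|k_j| - 1) n^2.$ 

Therefore, since $d(x, z) = d_n(x, y)$, and allowing the value of $C$ to change from line to line, 

$\mathbb{P}_x(\eta_i = y) \le \frac{C}{i}\, e^{-d_n(x,y)^2/4i} \Bigl(1 + \sum_{k=1}^{\infty} e^{-k(k-1)n^2/4i} + \sum_{r,s=1}^{\infty} e^{-(r(r-1) + s(s-1))n^2/4i}\Bigr).$ 

Comparing the last two terms with integrals, we get 

$\mathbb{P}_x(\eta_i = y) \le \frac{C}{i}\, e^{-d_n(x,y)^2/4i} \Bigl(1 + \int_0^{\infty} e^{-u^2 n^2/4i}\, du + \int_0^{\infty}\!\!\int_0^{\infty} e^{-(u^2 + v^2) n^2/4i}\, du\, dv\Bigr)$ 

$\le \frac{C}{i}\, e^{-d_n(x,y)^2/4i} \Bigl(1 + \frac{\sqrt{i}}{n} + \frac{i}{n^2}\Bigr).$ 

It is not difficult to verify by considering the cases $i \le n^2$ and $i > n^2$ that 
this completes the proof. □ 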
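
The two regimes of Lemma 10.4 can be observed on a small torus. The sketch below propagates the law of the walk exactly and records the largest ratio between $\mathbb{P}_0(\eta_i = y)$ and the right-hand side of the lemma with $C = 1$. It assumes that the lemma has the form $\frac{C}{i} e^{-d_n(x,y)^2/4i}$ for $i \le n^2$ and $C n^{-2}$ for $i > n^2$, and that the walk takes nearest-neighbor steps $(\pm 1, 0)$, $(0, \pm 1)$ modulo $n$; both are assumptions of this example, as is the function name.

```python
import math


def torus_walk_ratio(n, horizon):
    """Largest value of P_0(eta_i = y) / bound_i(y) for 1 <= i <= horizon.

    bound_i(y) is exp(-d_n(0, y)^2 / (4 i)) / i for i <= n^2 and 1 / n^2 otherwise,
    i.e. the assumed right-hand side of Lemma 10.4 with C = 1.
    Assumption: the walk takes nearest-neighbor steps modulo n.
    """
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    points = [(a, b) for a in range(n) for b in range(n)]

    def dist_sq(y):
        # Squared toric distance from the origin.
        return sum(min(c, n - c) ** 2 for c in y)

    p = {y: (1.0 if y == (0, 0) else 0.0) for y in points}
    worst = 0.0
    for i in range(1, horizon + 1):
        nxt = {y: 0.0 for y in points}
        for (a, b), mass in p.items():
            for da, db in steps:
                nxt[((a + da) % n, (b + db) % n)] += 0.25 * mass
        p = nxt
        for y in points:
            if i <= n * n:
                bound = math.exp(-dist_sq(y) / (4.0 * i)) / i
            else:
                bound = 1.0 / (n * n)
            worst = max(worst, p[y] / bound)
    return worst


if __name__ == "__main__":
    print(torus_walk_ratio(7, 150))
```

The returned number is a lower bound on any admissible constant $C$ for the chosen $n$ and horizon.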

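Lemma 10.5. For any $x \ne y \in T_n$, we have 

$0 \le \mathrm{Cov}(\phi_x, \phi_y) \le C \log \frac{n}{d_n(x, y)} + C,$ 

and $\mathrm{Var}(\phi_x) \le C \log n$, where $C$ is a universal constant. 

Proof. From the representation (41), we see that the covariances are nonnegative. 
Now fix two distinct points $x, y \in T_n$, and let $d = d_n(x, y)$. From (41), 
we have 

$\mathrm{Cov}(\phi_x, \phi_y) = \sum_{i=0}^{\infty} \mathbb{P}_x(\eta_i = y)\, \mathbb{P}(\tau \ge i) \le \sum_{1 \le i \le n^2} \mathbb{P}_x(\eta_i = y) + \sum_{i > n^2} \mathbb{P}_x(\eta_i = y)\, \mathbb{P}(\tau \ge i).$ 

Clearly, by Lemma 10.4 and the fact that $\mathbb{E}(\tau) = n^2$, the second sum can be 
bounded by a constant that does not depend on $n$. For the first, note that by 
Lemma 10.4 and the inequality $e^{-x} \le x^{-1}$ that holds for $x > 0$, we have 

$\sum_{1 \le i \le n^2} \mathbb{P}_x(\eta_i = y) \le \sum_{1 \le i \le n^2} \frac{C}{i}\, e^{-d^2/4i} \le \sum_{1 \le i \le d^2} \frac{C}{i} \cdot \frac{4i}{d^2} + \sum_{d^2 < i \le n^2} \frac{C}{i}$ 

$\le C + C(\log n^2 - \log d^2).$ 

The bound on the variance follows similarly. This completes the proof. □ 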

Proof of Theorem 10.2. Fix some $r \le C \log n$, where $C$ is the same constant 
as in Lemma 10.5. If $\mathrm{Cov}(\phi_x, \phi_y) \ge r$, then by Lemma 10.5, 

(45) $d_n(x, y) \le n e^{1 - r/C} =: s.$ 

Let $A$ be an $s$-net of $T_n$ for the metric $d_n$ (i.e. a set of points that are mutually 
separated from each other by distance $> s$, and is maximal with respect to 
this property). Let $\mathcal{C}(r)$ be the collection of all $2s$-balls around the points 
of $A$. Then by (45) we see that whenever $\mathrm{Cov}(\phi_x, \phi_y) \ge r$ we must have 
that $x, y \in D$ for some $D \in \mathcal{C}(r)$. By symmetry, we see that the probability 
of the maximum being at any point $z \in T_n$ is exactly $n^{-2}$. Therefore, 
the probability of the maximum being in any given $D \in \mathcal{C}(r)$ is bounded 
by $K s^2/n^2$, where $K$ is a universal constant. Thus, in the terminology of 
Theorem 4.3, 

$\rho(r) \le \frac{K s^2}{n^2} = K' e^{-2r/C},$ 

where $K' = K e^2$. We can assume that $K' > 1$. Again, if $x \in T_n$ and 
$D \in \mathcal{C}(r)$ contains $x$, then the center of $D$ must be at distance $\le 2s$ from $x$. 
Now, the centers of the members of $\mathcal{C}(r)$ are separated by distance $> s$ from 
each other. Clearly, the maximum number of $s$-separated points in a $2s$-ball 
can be bounded by a universal constant that does not depend on $n$ and 
$s$. It follows that the number of members of $\mathcal{C}(r)$ that contain $x$ is bounded 
by a universal constant. Therefore, in the notation of Theorem 4.3, $\mu(r)$ 
is bounded by a universal constant. Using the bounds on $\rho(r)$ and $\mu(r)$ in 
Theorem 4.3 and the inequality $1 - x \le -\log x$ for $x > 0$, we see that for 
some universal constant $\kappa$, 

$\mathrm{Var}\bigl(\max_{x \in T_n} \phi_x\bigr) \le C \log K' + \int_{C \log K'}^{C \log n} \frac{\kappa}{(2r/C) - \log K'}\, dr.$ 

It is easy to see that the right hand side is bounded by a constant multiple 
of $\log\log n$. This completes the proof. □ 
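
The last step can be made explicit. Assuming the integral has the form $\int_{C \log K'}^{C \log n} \kappa\,((2r/C) - \log K')^{-1}\, dr$, the substitution $u = (2r/C) - \log K'$ gives the closed form $\frac{\kappa C}{2} \log\bigl((2\log n - \log K')/\log K'\bigr)$, which grows like $\log\log n$. The sketch below compares this closed form with a midpoint-rule approximation; the parameter values are arbitrary placeholders, since the paper does not specify $C$, $K'$ or $\kappa$.

```python
import math


def integral_closed_form(n, C, K_prime, kappa):
    """Closed form of the integral of kappa / (2r/C - log K') over [C log K', C log n]."""
    top = 2.0 * math.log(n) - math.log(K_prime)
    bottom = math.log(K_prime)
    return 0.5 * kappa * C * math.log(top / bottom)


def integral_midpoint(n, C, K_prime, kappa, steps=20000):
    """Midpoint-rule approximation of the same integral, as an independent check."""
    lo, hi = C * math.log(K_prime), C * math.log(n)
    h = (hi - lo) / steps
    total = 0.0
    for j in range(steps):
        r = lo + (j + 0.5) * h
        total += kappa / (2.0 * r / C - math.log(K_prime))
    return total * h


if __name__ == "__main__":
    # Placeholder constants; requires K_prime > 1 and n > K_prime.
    print(integral_closed_form(10 ** 6, 1.5, math.e, 2.0))
    print(integral_midpoint(10 ** 6, 1.5, math.e, 2.0))
```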

11. Example: Gaussian fields on Euclidean spaces 

Consider a stationary centered Gaussian process $(X_n)_{n \ge 0}$. If we have 
$\mathrm{Cov}(X_0, X_n)$ decaying to zero faster than $1/\log n$ as $n \to \infty$, then it is 
known at least since the work of Berman [8] that $M_n := \max\{X_0, \ldots, X_n\}$ 
has fluctuations of order $(\log n)^{-1/2}$ and upon proper centering and scaling, 
converges to the Gumbel distribution in the limit. This result has seen 
considerable generalizations for one dimensional Gaussian processes, both 
in discrete and continuous time. Some examples are [60], [61], [62], and [54]. 
For a survey of the classical literature, we refer to the book [46]. 

The question is considerably harder in dimensions higher than one. A large 
number of sophisticated results and techniques for analyzing the behavior of 
the maxima of higher dimensional smooth Gaussian fields are now known; 
see the excellent recent book of Adler and Taylor [1] for a survey. Here 
'smooth' usually means twice continuously differentiable. However, in the 
absence of smoothness, maxima of high dimensional Gaussian fields are still 
quite intractable. If only the expected size of the maximum is of interest, 
advanced techniques exist (see the book on chaining by Talagrand [75]). The 
question of fluctuations is much more difficult. In fact, for general 
(non-smooth) processes, even the one-dimensional work of Pickands [60], [61], [62] 
is considerably nontrivial. 

One basic question one may ask is the following: what is a sufficient 
condition for the fluctuation of the maximum in a box of side length $T$ to 
decrease like $(\log T)^{-1/2}$? In other words, when does the maximum behave 
as if it were the maximum of a collection of i.i.d. Gaussians, one per unit 
area in the box? Classical theory [51] tells us that this is true for stationary 
one-dimensional Gaussian processes whenever the correlation between $X_0$ 
and $X_T$ decreases at least as fast as $1/\log T$. Note that this requirement 
for the rate of decay is rather mild, considering that it ensures that the 
maximum behaves just like the maximum of independent variables. 

In dimensions $\ge 2$, the above question is unresolved. Here questions 
may also arise about maxima over subsets that are not necessarily boxes. 
Moreover, what if the correlation decays slower than $1/\log T$? In this 
section, we attempt to answer these questions. Our achievements are modest: 
we only have upper bounds on the variances. The issue of limiting 
distributions is not solved here. 

Let $X = (X_u)_{u \in \mathbb{R}^d}$ be a centered Gaussian field on $\mathbb{R}^d$ with $\mathbb{E}(X_u^2) = 1$ 
for each $u$. For any Borel set $A \subseteq \mathbb{R}^d$, let 

$M(A) := \sup_{u \in A} X_u, \quad m(A) := \mathbb{E}(M(A)).$ 

For any $u \in \mathbb{R}^d$ and $r > 0$, let $B(u, r)$ denote the open ball of radius $r$ and 
center $u$. Assume that 

(46) $L := \sup_{u \in \mathbb{R}^d} m(B(u, 1)) < \infty.$ 

Note that in particular, the above condition is satisfied when the field is 
stationary and continuous. Next, suppose $\phi : [0, \infty) \to \mathbb{R}$ is a decreasing 
function such that for all $u, v \in \mathbb{R}^d$, 

$\mathrm{Cov}(X_u, X_v) \le \phi(|u - v|),$ 

where $|u - v|$ denotes the Euclidean distance between $u$ and $v$. Assume that 

(47) $\lim_{s \to \infty} \phi(s) = 0.$ 

For a set $A \subseteq \mathbb{R}^d$ and $\epsilon > 0$, let $N(A, \epsilon)$ be the maximum number of points 
that can be found in $A$ such that any two of them are separated by distance 
greater than $\epsilon$ (such a collection is usually called an $\epsilon$-net of $A$). When 
$\epsilon = 1$, we simply write $N(A)$ instead of $N(A, 1)$. The following theorem is 
the main result of this section. 

Theorem 11.1. Assume (46) and (47). Then for any Borel set $A \subseteq \mathbb{R}^d$ 
such that $\mathrm{diam}(A) > 1$, we have 

$\mathrm{Var}(M(A)) \le C_1(\phi, d) \Bigl(\phi\bigl(N(A)^{C_2(\phi, d)}\bigr) + \frac{1}{\log N(A)}\Bigr),$ 

where $C_1(\phi, d)$ and $C_2(\phi, d)$ are constants that depend only on the function 
$\phi$ and the dimension $d$ (and not on the set $A$), and $N(A)$ is defined above. 
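
The packing number $N(A, \epsilon)$ that enters Theorem 11.1 is a maximum over all separated subsets and is not easy to compute exactly. A simple way to get a feel for it is the greedy construction below, which returns a separated subset of a finite point cloud that is maximal (no further point can be added) but not necessarily of maximum size; its length is therefore only a lower bound on $N(A, \epsilon)$ for the sampled set. The function name and the point cloud are illustrative.

```python
import math


def greedy_separated_subset(points, eps=1.0):
    """Greedily pick points whose pairwise distances all exceed eps.

    The result is maximal (no further point of `points` can be added), but it is
    not guaranteed to have maximum size, so its length is only a lower bound on
    N(A, eps) when `points` samples the set A.
    """
    chosen = []
    for p in points:
        if all(math.dist(p, q) > eps for q in chosen):
            chosen.append(p)
    return chosen


if __name__ == "__main__":
    cloud = [(float(i), 0.0) for i in range(10)]
    print(greedy_separated_subset(cloud, 1.0))
```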

Remarks. The first observation is that if $\phi(s)$ decreases at least as fast as 
$1/\log s$, then the above result clearly shows that 

$\mathrm{Var}(M(A)) \le \frac{C(\phi, d)}{\log N(A)},$ 

where $C(\phi, d)$ is some constant that depends only on $\phi$ and $d$. In particular, 
it gives a broad generalization of the classical results about fluctuations in 
one dimension [8], [60], [61], [62], [54]. An additional observation is that the first 
term in the bound can dominate the second only if $\phi(s)$ decreases slower 
than $1/\log s$. 

Before we embark on the proof of Theorem 11.1, we need some simple 
upper and lower bounds on the expected value of $M(A)$. 



Lemma 11.2. Under (46) and (47), for any Borel set $A \subseteq \mathbb{R}^d$ such that 
$N(A) \ge 2$, we have 

$c_1(\phi, d) \sqrt{\log N(A)} \le m(A) \le c_2(\phi, d) \sqrt{\log N(A)},$ 

where $c_1(\phi, d)$ and $c_2(\phi, d)$ are positive constants that depend only on the 
function $\phi$ and the dimension $d$. 

Proof. The upper bound follows easily from a combination of assumption (46), 
the tail bound from Proposition 1.3 for the maximum of the field in unit 
balls, and an argument similar to the proof of Lemma 2.1. 

For the lower bound, first let $s > 1$ be a number such that $\phi(s) < 1/2$. 
Such an $s$ exists by the assumption that $\lim_{s \to \infty} \phi(s) = 0$. Next, let $B$ be 
a 1-net of $A$ and $D$ be an $s$-net of $B$. For each $x \in D$, the $s$-ball around $x$ 
can contain at most $k$ points of $B$, where $k$ is a fixed number that depends 
only on $s$ and the dimension $d$. Thus, 

$|D| \ge \frac{|B|}{k} = \frac{N(A)}{k}.$ 

Since $\phi(s) < 1/2$ and $\mathbb{E}(X_u^2) = 1$ for each $u$, by the Sudakov minoration 
technique from Section 5 we have 

$m(A) \ge m(D) \ge C \sqrt{\log |D|} \ge c_1(\phi, d) \sqrt{\log N(A)},$ 



where $C$ is a universal constant and $c_1(\phi, d)$ is a constant depending only 
on the function $\phi$ and the dimension $d$. □ 

Proof of Theorem 11.1. Let $c_1$ and $c_2$ be the constants from Lemma 11.2. 
Put 

$s := N(A)^{c_1^2/8c_2^2},$ 

and assume that $N(A)$ is so large that $s > 2$. Let $r = \phi(s)$. Take any 
maximal $s$-net of $A$, and let $\mathcal{C}(r)$ be the set of $2s$-balls around the points in 
the net. It is easy to verify using the definition of $s$ and the decreasing nature 
of $\phi$, that $\mathcal{C}(r)$ is a covering of $A$ satisfying the conditions of Theorem 4.3. 
Now take any $D \in \mathcal{C}(r)$. Since $D$ is a $2s$-ball and $s > 2$, by Lemma 11.2 we 
have 

$m(D) \le c_2 \sqrt{\log 2s} \le c_2 \sqrt{2 \log s} = \frac{c_1}{2} \sqrt{\log N(A)}.$ 

Also, by Lemma 11.2 we have $m(A) \ge c_1 \sqrt{\log N(A)}$. Thus, using the 
notation of Theorem 4.3, we have by Proposition 1.3 that 

$\mathbb{P}(M(D) \ge M(A)) \le \mathbb{P}\Bigl(M(D) \ge m(D) + \frac{m(A) - m(D)}{2}\Bigr) + \mathbb{P}\Bigl(M(A) \le m(A) - \frac{m(A) - m(D)}{2}\Bigr)$ 

$\le 2 \exp\Bigl(-\frac{(m(A) - m(D))^2}{8}\Bigr).$ 

Since the above bound does not depend on $D$, it serves as a bound on $\rho(r)$. 
Let us now get a bound on $\mu(r)$. Take any $u \in A$. Then the center of any 
$D \in \mathcal{C}(r)$ that contains $u$ is a point in $B(u, 2s)$. The centers are mutually 
separated by distance more than $s$. Hence, the number of $D \in \mathcal{C}(r)$ that 
can contain $u$ is bounded by $N(B(u, 2s), s)$, which, by scaling symmetry, 
is equal to $N(B(0, 2), 1)$. Since $\mu(r)$ is the expected number of elements of 
$\mathcal{C}(r)$ that contain the maximizer of $X$ in $A$, we therefore have $\mu(r) \le c_3$, 
where $c_3$ is a constant that depends only on the dimension $d$. Combining 
the bounds, we see that whenever $N(A) \ge c_4$, we have 

$\frac{\mu(r)}{|\log \rho(r)|} \le \frac{c_5}{\log N(A)},$ 

where $c_4$ and $c_5$ are constants depending only on $\phi$ and $d$. Note that we have 
this bound only for one specific value of $r$ defined above. Now, if $r' \ge r$, and 
we define $\mathcal{C}(r')$ the same way as we defined $\mathcal{C}(r)$, then clearly $\mathcal{C}(r')$ would 
also be a cover of $A$ satisfying the requirements of Theorem 4.3, and we 
would have $\rho(r') \le \rho(r)$. Noting this, and the fact that $|(1 - x)/\log x| \le 1$ 
for all $x \in (0, 1)$, we have by Theorem 4.3 that 

$\mathrm{Var}(M(A)) \le c_6 \Bigl(r + \frac{1}{\log N(A)}\Bigr)$ 

for some constant $c_6$ depending only on $\phi$ and $d$. Of course this holds only if 
$N(A) \ge c_4$, but this condition can now be dropped by increasing the value 
of $c_6$, since we always have $\mathrm{Var}(M(A)) \le 1$. Plugging in the value of $r$, the 
proof is done. □ 

12. Open problems 

In some sense, this paper raises more questions than it solves. Some of 
these problems are listed below. At present, the author does not know how 
to solve any of these. 

(1) Prove chaos in the original Sherrington-Kirkpatrick model, where 
$\xi(x) = x^2$. This is perhaps the most important open question that 
may be solvable by suitable extensions of the methods of this paper. 

(2) Prove chaos in directed polymers in dimensions $\ge 3$. Certain parts 
of the current proof do not work in higher than 2 dimensions, but 
other parts are okay. Since the Benjamini-Kalai-Schramm approach 
is inherently dimension-free, this seems to be a more doable open 
problem. 

(3) Improve the fluctuation exponent in directed polymers. The current 
proof via hypercontractivity gives only a log n correction. Although 
this suffices to prove chaos, it is not satisfactory. 

(4) Improve the variance bound in the discrete Gaussian free field with 
zero boundary condition. Although the case of the torus is interest- 
ing, the zero boundary condition is the more important one. Also, 
even in the torus, is log log n the correct order? The author thinks 
$O(1)$ is more likely to be the right answer. 

(5) Show that chaos implies multiple peaks even when the correlations 
are not all nonnegative. This involves getting a bound on the second 
moment of R{I^, /*), which we do not know how to derive at present. 

(6) Find a suitable multiple peaks condition that is actually equivalent 
to chaos and superconcentration. There is a possibility that there 
does not exist any such condition. 

(7) Find a more intuitive and less analytical (rigorous) proof of the 
equivalence of superconcentration and chaos. 

(8) A very interesting question in the context of multiple peaks is the is- 
sue of 'bridges': do the multiple peaks exist as disconnected islands, 
or do there exist 'bridges' that allow one to move from one peak to 
another without 'climbing down' ? This question is particularly rele- 
vant for the Kauffman-Levin fitness model, for obvious evolutionary 
implications. Our method of proving the existence of multiple peaks 
suggests an obvious way to prove the existence of bridges, but the 
author does not yet know how to carry out the program for the NK 
model. 